CONDITIONS OF STATISTICAL RESEARCH*


Simon Kuznets
Professor of Economics and Statistics, University of Pennsylvania 


I 


STATISTICAL DATA may be defined as numerical observations of aggregates;1
statistical theory as a body of tools for use in collecting
and analyzing such observations. Provided we include all aggregates 
subject to uncontrolled change, these definitions should be broad 
enough to encompass all the activities of statisticians. 

The role of statistical data and tools in the study of society can 
hardly be overestimated. Social aggregates are seldom subject to experiment,
i.e., to the “imposition of deliberate change with the intention
of studying its effects.”2 The extraordinary power of this
technique, which permits the analyst to dispense with an unwieldy 
mass of raw data and put in its place a few observations made in a 
controlled experimental situation, is not often at the disposal of the 
student of society. Perhaps if this student belonged to some non- 
human species, could manipulate the human race, and were interested 
in a few outward, observable aspects of human society, statistical data 
would prove of small moment to him. But we are members or kin of
members of the society we study; the very directions of analysis are 
determined by our interests as members; and our power of experi- 
menting with social aggregates is exceedingly limited. Under such con- 
ditions, statistical observations are perhaps the only type of record 
that combines, if precautions are taken, freedom from human bias 
with the numerical form that permits a thoroughness of analysis other- 
wise impossible. 





* Presidential address delivered at the 109th Annual Meeting of the American Statistical Associ- 
ation on December 28, 1949. 

1 An aggregate is a collective the several parts of which are distinguishable, and can be treated as
homogeneous and independent of the collective under which it is classified. The measure of a 
single item is not statistics unless the item is viewed as part of an aggregate. A collective entity whose 
parts are completely interdependent and interdetermined is not an aggregate. 

2 F. Yates, “Agriculture, Sampling, and Operational Research,” paper presented at the September
1949 meeting of the International Statistical Institute, Bern, Switzerland. 
The statement that controlled experiment is not feasible in the
study of social aggregates has become so trite that it is accepted with- 
out full appreciation of its ramified implications. These implications 
are the main subject of this discussion, and my purpose in dwelling 
on them is to see what guidance they offer in developing a statistical 
discipline for the study of society. We discuss the conditions that gov- 
ern the statistical study of society under four heads: the supply of 
data; their reliability; the structure of the universe recorded by them; 
pressure for and obstacles to analysis. 


II 


The supply of statistics relating to society is affected in only small 
degree by the interests and views of the scholar. In areas where con- 
trolled experiment is possible, the experimenter produces his own data 
—in accordance with his analytical goals. Data on social aggregates 
are produced by society at large, and only society itself, through its 
various agencies, is in a position to collect the records. Whether these 
are gathered in a deliberate search for information—as in our Cen- 
suses—or as a byproduct of administrative activity; whether the 
social aggregate operates through the sovereign and authoritative 
organs of the state or through semi-voluntary bodies (trade associa- 
tions, trade unions, professional societies, etc.)—the production of the 
data is a social, not an individual, act. 

This social-act origin of data on social aggregates greatly affects their 
coverage. The decision to collect data for their broad, informational 
value is governed by the importance of the information for the general 
operation of society, weighed against the costs and difficulties of 
getting it—costs in both dollars and social resistance to be encoun- 
tered, and difficulties in terms of whether the respondents have the 
information at hand as a result of day-to-day activity. When the data 
are collected as a byproduct of administration, their supply depends 
first, upon the decision of society to regulate or administer some ac- 
tivity; secondly, upon the balance between the prospective returns 
and the cost of getting and making the data available. Finally, for a 
wide variety of statistics collected by private and semi-public agencies 
the determining factors are the market demand and the willingness of 
respondents to share the information—both factors reflecting the need 
for the statistics on the part of potential users. 





3 Social aggregates are not easily observable by an individual, unless he happens to be in command
and can order the collection of data. Contrast the ease with which an individual can observe many 
natural phenomena, e.g., the rising and setting of the sun. It is perhaps this ease that led to a much 
earlier accumulation of empirical, not necessarily experimental, data on natural phenomena. The kind 
of records individuals accumulated about society in the early days was often worse than none. 


The supply of statistics on social aggregates is thus determined by 
needs, administrative emphasis, and costs—all as viewed by the vari- 
ous agencies of society which, at any given time, have the power to 
decide what is important. The views of these agencies necessarily 
differ from those that would be entertained by a scientifically minded 
analyst, even one who happened to live at the same time and in the 
same place. Consequently, the supply of data is capricious as judged 
by any consistent and reasoned standard for scientific analysis. Ex- 
amples are abundant in every field of social study: in economics, with 
which I am most familiar, one can find them easily by leafing through 
the recent, valuable publication of the Census Bureau, Historical 
Statistics of the United States. As we all know, data on the foreign trade 
of this country are plentiful, and run back to the founding of the 
Republic; whereas data on domestic production of many major com- 
modities have been scanty until recent decades, and on volume of do- 
mestic trade almost completely absent until the 1930’s. We have long 
series on revenues and expenditures of the federal government, but as 
yet no reliable continuous information on the revenues and expendi- 
tures of the individuals and households that make up the nation. It 
would not be difficult to explain why the long series have been col- 
lected, even including such an esoteric item as the mackerel catch on 
the Atlantic Coast; or why data on some other, more important as- 
pects of our economic activity are missing. But understanding the 
situation is cold comfort when we are confronted with the difficulties 
of analyzing many major facets of society. 

A significant, if obvious, aspect of the situation is that the views of 
society and its agencies about the importance of a social process and 
the difficulties of measuring it change from time to time and place to 
place. In countries where the industrial and money system have spread 
most widely it has become easier in recent decades to measure many 
economic and social phenomena: life has grown more accountable 
quantitatively, and technical improvements have rendered data col- 
lection and tabulation less costly. Also, the attention paid by society 
to various social and economic problems has increased, powering a 
drive for more statistical data. In consequence, the supply of statistics 
in the industrialized countries has grown apace in recent decades, and 
is today much richer for these areas than for less industrialized coun- 
tries.5 





4 See Series F-164, p. 128. 
5 Recent developments in sampling theory and practice have not only reduced materially the 
costs of data collection but also made it feasible where formerly it was impossible; and served to yield 
assignable limits to at least some of the errors in them. This promises well for the future. The past,
on which the scholar depends, is scarcely retrievable by these methods. 
All this is to the good. But we must not overlook the fact that statistical
analysis of social phenomena requires data for periods marked
by a variety of changes over time, and for social systems marked by a 
diversity of social organization. Without this variety, the analyst 
cannot distinguish the components of change and test the stability of 
patterns and relations. There is a fundamental conflict between the 
selectivity in the supply of statistical data on social aggregates— 
selectivity that confines the data largely to certain types of society 
and certain historical phases in their development—and the variety of 
coverage required by the essential purpose of statistical analysis: to 
establish variable and constant elements in the complex of change. 


III


The social origin, uncontrolled experimentally, of social statistics 
obviously affects their reliability. A sketch of the sources of possible 
error may be useful in indicating one of the major tasks of a statis- 
tician concerned with the study of society. Errors may arise from lack 
of control, first, by the respondent; second, by the collecting agency; 
and, third, by the analytically minded final user. 

Almost all primary statistics on social aggregates are derived by 
some composition, if not necessarily straight addition, of reports by 
individuals, responding either as individuals or as representatives of 
some enterprise unit. The respondents are usually numerous, and 
errors in their reports may easily arise—either because they delib- 
erately falsify or because their knowledge is not full or accurate. In 
either event, the errors may be correlated with the true values of the 
characteristics. Certainly, deliberate errors of response caused by an- 
ticipation of consequences—pleasant or unpleasant—are likely to be 
systematically associated with the true magnitudes. Involuntary er- 
rors may reflect a lack of knowledge that is itself a social characteristic 
connected with the phenomenon that is recorded. The patterns and 
directions of this association between errors and the true magnitudes 
vary from group to group, time to time, and place to place. It is, 
therefore, not easy to deal with them by a broad and standard scheme: 
specific knowledge of the character of bias or of ignorance is required. 
The problem is avoided or minimized in controlled situations where 
the experimenter himself makes the observations. 
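
The point about errors that are correlated with the true values can be illustrated with a small simulation. This is a minimal sketch under assumed conditions, none of which come from the address: the lognormal "true" values, the size of the random error, and the rule that respondents above the median understate their figure by 20 percent are all hypothetical. It compares the aggregate obtained when errors are unrelated to the true magnitudes with the aggregate obtained when they are systematically associated with them.

```python
import random
import statistics


def simulate_reports(n=10_000, seed=0):
    """Return true values plus two sets of hypothetical respondent reports."""
    rng = random.Random(seed)
    # Hypothetical true magnitudes (e.g., incomes); the distribution is an assumption.
    true_values = [rng.lognormvariate(10, 0.5) for _ in range(n)]
    cutoff = statistics.median(true_values)
    # Case 1: errors independent of the true magnitude.
    independent = [v + rng.gauss(0, 2000) for v in true_values]
    # Case 2: assumed rule, larger units understate by 20 percent,
    # so the error is systematically tied to the true value.
    correlated = [
        (0.8 * v if v > cutoff else v) + rng.gauss(0, 2000) for v in true_values
    ]
    return true_values, independent, correlated


def relative_error_of_total(reported, true_values):
    """Relative error of the reported aggregate against the true aggregate."""
    true_total = sum(true_values)
    return (sum(reported) - true_total) / true_total


true_values, independent, correlated = simulate_reports()
err_independent = relative_error_of_total(independent, true_values)
err_correlated = relative_error_of_total(correlated, true_values)
print(f"independent errors: {err_independent:+.2%}")
print(f"correlated errors:  {err_correlated:+.2%}")
```

Under these assumptions the independent errors largely cancel in the total, while the correlated errors leave a sizable downward bias that no increase in the number of respondents removes. Correcting it would require knowing the specific rule of misreporting, which is the kind of specific knowledge the text calls for.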

But even were all units to respond candidly and accurately, errors 
might still arise in the primary data, as compiled and made available 
by the collecting agencies. They might be committed first in the at- 
tempt at a complete count or estimate. Even complete enumerations 
rarely cover all units that should be included; and in designedly incomplete
surveys, the transition from presumably correct responses by
the sample to the comprehensive totals is seldom free from serious pit- 
falls. Second, units must be grouped into broader categories; and the 
choices and decisions involved in this step may be unavoidably ar- 
bitrary. Consider, as an illustration, the problem of classifying cor- 
porations by their industrial affiliation—a basic set of data in the 
economic statistics of this country. With the most accurate and detailed 
reporting of products, difficulties would arise in placing multiproduct 
corporations in one or another industry; and the grouping of the 
responding units in the industrial categories involves arbitrary ele- 
ments and hence consciously committed errors on the part of the com- 
piling agencies. In the experimentally controlled situation, on the con- 
trary, the selected and limited body of data may not require grouping 
into categories; or can easily be allocated to those devised with special 
relevance to their unequivocal fitness. 

Both phases in the flow of primary data to the analytically minded 
user are largely irreversible. Errors in reporting can be detected post 
facto but their adjustment is at best tentative. Errors by the collecting 
and publishing agency in passing from incomplete coverage to com- 
plete estimates or in grouping units can also be probed for—yet the 
individual analyst can seldom make a complete adjustment or re- 
classification. Even were we to assume accuracy on the part of both 
respondents and collecting agencies—in terms of the definitions and 
classifications the collecting agency employs—the results might still 
reflect lack of control from the standpoint of the analyst since the con- 
cepts and classifications he would like to use, in the light of his theo- 
retical model of the universe, may differ significantly from those of the 
collecting agency. This cleavage between the collector and the analyst 
is avoided or minimized in a controlled experiment where the analyst 
sets the conditions for data collection even if he himself is not the 
collector. 

Statistics on social aggregates must necessarily follow current in- 
stitutional concepts and patterns. The very facts that the main motive 
for their collection is their use in connection with the social problems 
of the day and that the ultimate source of information is the indi- 
vidual respondent, mean that concepts and classifications are bound 
to reflect institutional reality. Unless they do, the respondents will be 
unable to supply the information, short of an improbable endowment 
with the tools and techniques of the scientific analyst; and the collect- 
ing agencies will be unable to combine and group the results so as to 
provide links to the existing institutions—the only possible loci of
public policy. But these institutional concepts and classifications may 
differ from those used as tools by the analyst. To illustrate: what business
enterprises recognize and report as profits may well differ from
what economic theory defines as profits; what society recognizes as 
capital or assets may differ significantly from the economist’s defini- 
tion. I do not mean to imply that the world of the analyst and the real 
world are completely apart; they must be and are closely related. Nor 
do I mean to suggest that it is impossible to obtain information re- 
flecting somewhat more sharply defined concepts and classifications 
than exist in crude reality. But the real and familiar cannot be stretched 
at will into more consistent intellectual categories; and the close rela- 
tion between the world of the mind and the world of reality does not 
preclude significant differences, where the sharply etched concepts and 
classifications of analysis cannot find any exact counterpart in 
statistics, given their origin and the necessary adherence to institu- 
tional patterns. It is in this sense that, even if complete control is 
assumed by respondents and collecting agencies, the resulting statistics 
will have a fuzziness from the viewpoint of the analyst that implies 
lack of complete reliability.6


IV 


We turn now to the basic question of statistical theory in connection 
with the study of society—the nature of changes in observations of 
social aggregates. The first relevant statement is obvious: since experi- 
mental controls are not present, any item of statistical evidence con- 
cerning society is an historical series. This is undeniably true of any 
chronological array of successive observations. But even a frequency 
distribution, a cross-section measured at a point in time, must be 
viewed as a unit in a time series, and the components of the cross- 
section may well be subject to different patterns of change over time 
and their given magnitudes reflect different phases in these patterns of 
temporal change. 

This observation is a truism. But its implications are so far-reaching 
that they are worth exploring, even at the risk of stressing the obvious. 
The foremost is the difficulty of deriving satisfactory hypotheses that 
are susceptible of test. The statistical investigator of social aggregates 
attempts to construct models of systematic change over time embodying 





6 This statement should not, of course, preclude the possibility of bridging the gap not only by 
better analysis of empirical data but also by reformulation of concepts which, as they stand, may not 
have any empirical counterpart whatever. 
whatever knowledge he has of the quantitative effects of factors that
produce the changes. Not in a position to isolate these factors by ex- 
periment, he has to devise hypotheses grounded, as far as possible, 
on comparisons of historical situations, comparisons that permit him 
to distinguish within the variety of conditioning circumstances sys- 
tematic patterns that can be associated with different groups of factors 
in operation. Resolution of time series into components, distinguished 
by their patterns of change; synthesis of partial time series into broader 
units to remove some elements of variability characterizing the parts 
but not the whole; association of distinct time series to see whether the 
patterns in each have some systematic relation with one another—all 
these devices have one common aim: to establish the significant pat- 
terns of systematic change, to elaborate models of both the patterns 
and the variability that can be associated with them. The ultimate hope 
is to trace these patterns and associations to factors that can be dis- 
tinguished and observed as independent variables, and to establish 
their persistence over time and the possible effect of policy on them. 
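
One of the devices named above, the resolution of a time series into components distinguished by their patterns of change, can be sketched in a few lines. The centered moving average used here is only one possible choice, adopted for illustration and not attributed to the address; the function names and the odd-window restriction are likewise assumptions of this sketch.

```python
def centered_moving_average(series, window):
    """Smooth a series with a centered moving average of odd length.

    Positions too close to either end to hold a full window are left as None.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        trend[i] = sum(series[i - half : i + half + 1]) / window
    return trend


def split_components(series, window):
    """Split a series into a smooth component and deviations from it."""
    trend = centered_moving_average(series, window)
    deviations = [None if t is None else y - t for y, t in zip(series, trend)]
    return trend, deviations


# A purely linear series: the smooth component reproduces it exactly in the
# interior, so the deviations there are zero.
linear = [2.0 * i for i in range(10)]
trend, deviations = split_components(linear, window=3)
print(trend[1:4], deviations[1:4])
```

Such a split is purely descriptive. It labels a pattern without tracing it to identifiable factors, so it serves as a first step toward a hypothesis of systematic change and not as the hypothesis itself.
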
The difficulties of deriving such hypotheses, building such explana- 
tory models, in a study of social aggregates are all too obvious; they 
reside in the complexity and changeability of social phenomena and in 
the limitations of the supply of data already discussed. Yet they must 
be solved before we can take advantage of the tools given us by current 
statistical theory for testing hypotheses. The theories of variance and 
sampling with which we are familiar are essentially grounded in situa- 
tions where controls are sufficient to eliminate the effects of the major, 
systematic bias factors, if not to produce complete stability. This 
statement is true whether we think of the classical theory or of modern 
developments in connection with small samples; whether we accept 
the maximum likelihood or the confidence-intervals approach; whether 
we think of the realistic counterpart of sampling theory in physics, 
biology, agronomy, or quality control.7 This means that as long as we
recognize systematic, if you wish, autocorrelated, patterns of temporal 
change in historical series, and recognize them as quantitatively 
dominating, we cannot accept, until hypotheses of systematic changes 
have been thoroughly worked out, assumptions concerning variance





7 It is of interest that the developments in sampling theory in recent decades were associated
with the emergence of experimental techniques in new fields, which called for a reconsideration of
theory for the purpose of dealing with the new situations. The association between the developments
in biology and agronomy and R. A. Fisher's work, between the emergence of mass production of standardized
units and quality control methods, and between the pressure of war needs and the emergence of
sequential analysis, are all cases in point. These are all experimental and quasi-experimental situations,
in a different class from the situations described by historical time series.
that underlie the current theory of frequency distributions and
sampling, assumptions that imply sufficient control to eliminate sys- 
tematic major factors. These assumptions may be used toward the 
end of an analysis of time series, not at the beginning; just as the 
statistical theory of normal and near-normal variance is used at the 
end of the controlled experiment, not at its beginning. 
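
A small simulation may make concrete why the variance assumptions of sampling theory cannot be taken for granted in a series with systematic temporal dependence. The sketch below is illustrative only: the first-order autoregressive scheme, the coefficient 0.8, the series length, and the number of replications are assumptions introduced here, and a real historical series need not follow any such simple scheme. It compares the standard error of the mean computed as if observations were independent with the actual spread of the mean across repeated series.

```python
import math
import random
import statistics


def ar1_series(n, phi, rng):
    """Generate x[t] = phi * x[t-1] + e[t] with standard normal shocks."""
    x = 0.0
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out


def naive_standard_error(series):
    """Standard error of the mean under the assumption of independence."""
    return statistics.stdev(series) / math.sqrt(len(series))


def compare(phi, n=200, reps=1000, seed=1):
    """Return (actual spread of the mean, average naive standard error)."""
    rng = random.Random(seed)
    means, naive = [], []
    for _ in range(reps):
        s = ar1_series(n, phi, rng)
        means.append(statistics.fmean(s))
        naive.append(naive_standard_error(s))
    return statistics.stdev(means), statistics.fmean(naive)


actual_indep, naive_indep = compare(phi=0.0)
actual_auto, naive_auto = compare(phi=0.8)
print(f"phi=0.0  actual {actual_indep:.3f}  naive {naive_indep:.3f}")
print(f"phi=0.8  actual {actual_auto:.3f}  naive {naive_auto:.3f}")
```

With independent observations the two figures agree closely; with strong positive dependence the naive figure understates the actual variability of the mean by a wide margin. The example shows only that the independence assumption fails; it does not supply the positive identification of systematic elements that the argument requires.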

The construction of hypotheses concerning systematic change in 
historical time series may properly be viewed as the task of the econo- 
mist, sociologist, etc., rather than of the statistical theorist. But who- 
ever uses statistical data in the search for intellectual order must recog- 
nize the proper relation between the construction of hypotheses and 
the application of tests now available in statistical theory; must, 
above all, recognize the gap between the structure of historical time 
series and post-controlled-experiment situations underlying statistical 
theory. He must resist the temptation to identify prematurely calendar 
time, as it is used in time series; viz., as a means of marking dates for 
possible use in associating a given phenomenon with others—preceding, 
contemporary, or succeeding; with experimental time; viz., time that 
is counted in seconds, hours, etc., elapsed since a change was delib- 
erately introduced to bring about an expected effect. He must resist 
the temptation to apply prematurely to historical series the elegant 
apparatus developed for an entirely different situation; and to pro- 
ceed by modifying theory merely to make a general allowance for de- 
parture from what might be called post-controlled-experiment vari- 
ance. For it is not the general magnitude of the departures from this 
essentially different model that is important or fruitful; the crux is in 
the positive identification of the systematic elements in the situation, 
and the need is for a study of their specific attributes. 


V


To complete the picture, attention should be paid to the pressures 
that play upon the statistical analyst of social aggregates. These are 
intangible, not easily describable in unequivocal terms. Their im- 
portance, however, warrants discussion, even at the danger of miscal- 
culating their true weights. We discuss these pressures briefly under 
three heads: economic, moral, and intellectual. 

By economic pressures we mean the ways in which allocation of re- 
sources by society shapes the statistical analysis of social problems. 
Thorough use of statistics is laborious and costly, and can scarcely be 
undertaken as a hobby. But in modern society conditions of gainful 
employment that permit statistical research are generally geared to 
current problems. Consider the situation in this country, where statistical
study of society is perhaps further advanced than elsewhere. A
large proportion of statisticians concerned with social aggregates are 
in the employ of governmental agencies. Their natural preoccupation 
with current problems, especially governmental policy, militates 
against gaining a long-time perspective, a world-wide horizon, and 
leisure for well founded conclusions. Another large group are con- 
nected with private and semi-public enterprises. Their attention too is 
naturally channeled into current problems, especially those of group 
or private interests. A much smaller group are attached to research 
institutions that are public but not governmental. Even here there is 
compulsion to deal with problems that are in the public interest, since 
the economic position of these institutions depends upon current pub- 
lic judgment of their usefulness. Finally, at the universities, which 
should be the sponsors of basic research, there is no tradition of such 
sponsorship (as distinct from teaching) in the social sciences and too 
much dependence upon outside financing—which again brings eco- 
nomic pressure for concentration upon current problems and for ‘re- 
sults.’ I do not mean that economic pressures are so monolithic that 
no opportunity exists for long range, multi-geared research, in which 
the investigator is allowed to build his conclusions slowly, on sound 
bases. But economic pressures favor research upon current problems, 
within a narrow space-time framework, and with a tendency toward a 
rapid formulation of ready and presumably useful conclusions. 

Even were there no economic pressures, research workers would be 
under moral pressure to concentrate upon current problems and to 
draw conclusions for use in policy making. It is as difficult for a social 
scientist to be indifferent to the interests of his subject, human soci- 
ety, as it is easy for an entomologist to disregard the interests of his 
ants or for the physicist to be blissfully oblivious to the hopes and 
aspirations of the atoms he bombards. And even if some students of 
society are able to deal with the longer range issues and eschew the 
problems of the country and of the day, their proportion is small be- 
cause the social sciences are bound to attract minds with a lively 
interest in current problems. 

The intellectual pressures are somewhat more subtle but perhaps 
stronger because they bear directly upon analysis concerned with the 
basic purpose of scientific study—to establish valid generalizations. 
These pressures follow two channels. The first is a consequence of the 
incurable urge of the human mind to seek system in the complex muta- 
tions of observable reality, to impose order on chaos. It is not easy to 
resist this urge by dwelling upon the scarcity and unreliability of observations
or the short span of experience they cover. Even if not concerned
with practical uses of results or with the problems of the day,
the scholar will still be inevitably concerned with the relevance of his 
observations to some consistent and broad framework of knowledge, 
to some system with a validity broader than the raw empirical data 
themselves possess. It is, therefore, understandable that limitations of 
data do not prevent attempts at theorizing and that this form of in- 
tellectual pressure produces a spate of theories, models, and generali- 
zations whose claims to validity go far beyond their empirical bases. 
How else explain the variety of hypotheses and conjectures in eco- 
nomics, viewed dispassionately and appraised against the empirical 
data used by their authors? The main factor is the inveterate urge of 
the human mind to generalize. 

The other channel of pressure operates even when one resists the 
temptation to establish intellectual order with inadequate data. If 
further development of empirical study, further additions to our stock 
of reliable and meaningful data, depend upon how we use the already 
available information; if in the cumulative process of growth of a 
science failure to analyze existing data may mean perpetuation of 
errors or scarcity of supply in the future, a strong motive exists for 
squeezing the last ounce of meaning out of the information at hand. 
No matter how careful and circumspect one may be in such an attempt, 
there is pressure in the direction of stretching data that may not be 
adequate to cover wider issues. 

One must not overlook the tremendous value of these intellectual 
urges and temptations—for advance in our knowledge has largely 
been powered by them. And it is equally easy to recognize that all 
scientific models and hypotheses must transcend the limits of the data 
that underlie them, must extend claims to the unknown. But many of 
such theory- or hypothesis-building attempts have fallen short of using 
the data that were already available at the time; and, most important, 
few have been tested by information other than that used in their 
derivation, or have ever been presented in a form in which effective 
tests could be made by others. In point of fact, the description of the 
empirical frame of reference of many hypotheses, generalizations, in- 
ferences, etc., in the study of society—even those claiming statistical 
foundation—is loose and ambiguous. 

Consequently, statisticians concerned with the study of society are 
confronted with another major task—the critical evaluation of the 
validity of hypotheses and theories claiming some degree of generality. 
It would obviously be impossible and sinfully wasteful to disregard
these products of the human mind completely, but it is equally diffi- 
cult to accept them at face value. The student of social aggregates 
thus works in a field cluttered up with the debris of intellectual en- 
deavor, and to salvage a modicum of what is useful is far from easy. 
One hardly need labor the point that evolving efficient tests and de- 
vices for assigning limits of inference to a given body of quantitative 
evidence is a major and customary task for the discipline of statistics. 


VI 


The above sketch of the conditions of statistical study of society, 
although oversimplified, is realistic enough to permit conclusions con- 
cerning the directions of work in the statistical discipline in the field. 
It is for the sake of these conclusions that we have treated the wide 
field so superficially. 

Their general tenor must have already become clear from the em- 
phasis in our earlier remarks. The discussion of the supply of statistical 
data suggests an urgent need to plan the supply for the future and to 
recover data that lie in obscurity owing to disuse. Extension of his 
interest and vision to encompass a longer period and a wider area, 
and to penetrate more deeply through attention to smaller time and 
space units, is the sole way a statistician can enrich the effective sup- 
ply of data. For it is only his work that will breathe life into the dead 
and dusty volumes of whatever statistics are available; force the at- 
tention of other students to the relevance of these data to effective 
knowledge; and provide the basis for a more systematic and rounded 
collection in the future. In stressing the social-act origin of statistical 
data on society, I certainly do not mean to imply that the supply 
cannot be influenced by individual workers. But the influence cannot 
be effective unless the horizons of statistical research, of the work of 
the statistical scholars themselves, are broadened. Such an extension
is possible with the data at hand. We need not wait for further accre- 
tions of data. My impression is that research is lagging behind the 
accumulation of data; that there is still a wide area in which, despite 
difficulties, fruitful work need not be estopped by lack of information; 
and that, accordingly, large potentialities of enriching the supply of 
data for the future lie in extending time and space boundaries and in 
refining units in research with data already available. 

Considerations of the reliability of the data suggest an even more 
obvious and specific task. We need most a more explicit description 
of how the data are or were obtained and a more specific appraisal of 
the possible errors involved. As a user, rather than collector, I have
been increasingly impressed in recent years with the difficulty of 
getting accurate information on the origin of various time series and of 
attaching magnitudes to the errors that characterize them. Indeed, 
as a rule, collectors and publishers of primary data do not deem it 
their obligation to accompany a series by a detailed description of 
how it was obtained; and users also, for the most part, tend to accept 
a series, particularly one issued by a governmental agency, at its face 
value without inquiring into its reliability. If this impression is cor- 
rect, there is surely room for much additional work. It may legiti- 
mately be urged that compilers and publishers of series give full details 
on methods of collection, compilation, classification, and adjustment; 
that various compendia of basic series supply descriptions of their 
origin as an indispensable part of the information; that users exercise 
their right to be informed about the derivation of the series offered 
them; and that authors of textbooks on statistics become cognizant of 
the problem and cease confining their attention to tools of analysis 
while forgetting the elementary question of the character of the pri- 
mary and derived data. All this opens a wide and still relatively 
uncultivated field. 

The problems of theory suggested by our discussion of the nature 
of changes in social statistics are, despite strenuous efforts, still rather 
intractable. It is clear that one must seriously question a great variety 
of statistical analyses that consciously or unconsciously are based 
upon too easy an assumption of post-experimental types of variance. 
Also, as already suggested, attempts to treat the cardinal difference 
between historical series and experimental types of variance by auto- 
correlation adjustment devices, variate difference methods, conversion 
to ranks, etc., are highly suspect, for the very reason that they drown 
the little specific knowledge we have of patterns of change in historical 
series in the anonymity of formal deviations from normal or near- 
normal variance. But even if this judgment is accepted, and I realize 
that many of our members will dissent, it cannot be claimed that we 
have effective models. We are still groping toward an adequate theory 
of historical series; a theory that may well differ from one substantive 
area to another. In such a situation we can only urge more analytical 
and empirical work; more serious attention by handlers of mathe- 
matical tools to the specific conditions of the model building task; 
and more interchange of knowledge among the many substantive fields, 
not only in the social sciences, in which the use of uncontrolled his- 
torical series is prevalent. One can hardly be dogmatic here, but a 
great deal is still to be done. We must extend the area of work that 
avoids the dangers of, on the one hand, limitations of elaborate mathe- 
matical models that rest upon oversimplified assumptions and, on the 
other, the intuitive type of approach—whose results are not subject 
to check and do not cumulate into a system of tested knowledge. 

This continued search for a positive theory of analysis of historical 
series should encourage rather than preclude more attention to the 
fourth major task: a critical appraisal of theories, generalizations, 
inferences. An obvious step in this direction has been suggested: an 
author’s obligation to state the space-time limits of validity he claims 
for his generalization. It is not uncommon to find statistical studies 
in which the consumption, or savings, or other ‘function’ is established 
—with an indication of the series used for the purpose, yet with vague 
claims of a more general validity for the function and no specification 
of the space-time limits of these claims. Obviously more explicitness in 
stating the limits of validity would be a major step forward—partly 
toward facilitating testing of the results, and even more toward pre- 
venting generalizations from too inadequate samples. It is a simple 
step, yet not easy; but then there is no inherent connection in intel- 
lectual performance between simplicity and ease. And beyond such a 
step, there is no need to urge the value of vigorous criticism of the out- 
put of statistical analysis, more vigorous than in the past: little prog- 
ress can be expected in a field such as we are dealing with unless we 
cull out untenable results. 


VII 


Two comments are added to place the discussion in a somewhat 
broader framework. First, while we have considered statistical research 
in the study of human society, many of the questions and suggestions 
apply to other fields in which uncontrolled historical series have to be 
used. Unfamiliarity with these other fields prevents me from urging 
this point strongly. But problems connected with the supply and re- 
liability of data and with the nature of the universe reflected in them 
in fields such as forestry, botany, zoology, biology, geology, meteorol- 
ogy, and even astronomy, seem to me not unlike those in the statistical 
study of human society. Naturally, the parallelism is far from exact. 
There should, however, be wide possibility of learning from other dis- 
ciplines, not only as among the various fields of social study, but also 
as among all students of human society, on the one hand, and in his- 
torically bound natural science disciplines, on the other. 

Second, is the subject under discussion really statistics or is it largely 
the substantive disciplines concerned with the study of society? It 
may be argued that what I have discussed is not statistics, if by the 
latter is meant the theory and practice of analyzing records of variance 
of a restricted type; that I have been dealing really with economics, 
sociology, etc., albeit of a quantitative type. My reaction would be 
to avoid a debate on semantics, even though statistics long meant a 
quantitative study of society, as the etymology of the very name 
reveals.§ What we call it does not matter; and if clarity may be attained 
thereby, let us christen the intellectual discipline I have been dis- 
cussing with a new name—say, Historical Arithmetic, on the pattern 
of Political Arithmetic, one of its ancestors. So long as we understand 
by Historical Arithmetic the study and analysis that deals with quan- 
titative records of historically bound processes, for the eventual pur- 
pose of establishing testable generalizations; so long as we agree that 
the intellectual problems arising in such an effort are common to a 
wide variety of substantive fields of study, we shall be dealing with a 
body of methods of divers application and a common core that justi- 
fies considering it a major field of scientific methodology. 

However such semantic questions are answered, it seems clear that 
in this field of scientific methodology, our Association, its present 
membership and those who will join our ranks in years to come, have 
an important task to perform. Under the shock of the catastrophic 
events of recent decades, belief in the possibility and usefulness of 
scientific study of human society has grown perceptibly weaker. There 
are, and will be, many to doubt that the search for objectively found 
patterns of stability and change is likely to yield significant knowledge 
in the field; and who will turn to other ways in which men may re- 
concile themselves to the apparently capricious turbulence of human 
history. We can easily understand and sympathize with such doubts. 
But it is difficult for me to see how there can be any guide for scien- 
tific work and development other than a belief in the existence of some 
order in the seemingly chaotic jumble of history; in the demonstrabil- 
ity of such order in empirical terms; and in the ultimate social useful- 
ness of the resulting body of tested theory. These three basic beliefs 
warrant examination of conditions of statistical research, such as has 
been sketched above, in the spirit of setting a task for the future rather 
than of apology for the past and for failure to go on; in terms of hope 
rather than of despair. 





§ See the discussion and references in the article on Statistics, Encyclopedia of the Social Sciences, 
Vol. XIV. 





INDEX OF THE PHYSICAL VOLUME PRODUCTION 
OF MINERALS, 1880-1948 


Y. S. Leong 
Division of Statistical Standards, Bureau of the Budget 


There is need for a reliable and authoritative index measur- 
ing the production of minerals in the United States in terms 
of physical quantity from the time when each of the mineral 
industries became an important factor in the national economy 
to the present. It is our hope that the annual index of 
mineral production for the period 1880 to 1948, together with 
the indexes for important mineral groups, as described here, 
will fill this need.1 Sixty-three annual physical volume series 
on mineral production have been selected for inclusion in the 
index. 


IT HAS BEEN our objective in designing an index measuring the production 
of minerals in the United States, that: (1) the over-all index 
and the group indexes should portray accurately the long-time growth 
and the annual changes of the output of minerals as a whole and of 
important segments of the mineral industries from the advent of the 
industrial revolution in this country to the present high level of indus- 
trial production, and (2) they should serve as benchmarks with which 
less comprehensive current production indexes of minerals and of 
groups of minerals may be compared and corrected from time to time. 

Sixty-three annual physical volume series on mineral production 
have been selected for inclusion in the index representing 98 per cent 
of the value of all minerals produced in the United States in the base 
period 1935-39.2 Some of the component series are of minor importance, 





1 The construction of the index was begun by the U. S. Bureau of Mines in 1937 under the super- 
vision of Dr. O. E. Kiessling, then chief economist of that Bureau. At the request of Dr. John W. Finch, 
then Director of the Bureau of Mines and Dr. Kiessling, the writer was assigned by the Central Statis- 
tical Board, of which he was a staff member, to advise that Bureau in constructing this index. Because 
of the resignation of Mr. F. J. McCarty, the computer of the index, and the absence on leave and 
finally the resignation of Dr. Kiessling, the construction of the index was suspended by the Bureau. 
The writer, however, has continued the computation from time to time with the aid of staff members 
of the Central Statistical Board and of the Board's successor, the Division of Statistical Standards, 
Bureau of the Budget, who happened to have some spare time. Grateful acknowledgment is made to 
Mr. F. J. McCarty, formerly of the Bureau of Mines, Mrs. Charlotte Jett and Mr. Elliott B. Woolley, 
formerly of the Central Statistical Board, for their laborious computations, and especially to Dr. O. E. 
Kiessling, now of the U. S. Tariff Commission for his aid and advice in planning the index. Because of 
the wide interest in such an index as evidenced by the many requests from economists and statisticians 
for the index, and because the Bureau of Mines will resume computation of the measurement as one 
of its regular functions, it is deemed advisable to publish a brief description of the method used in con- 
structing the index, together with the index numbers for all minerals and important groups of minerals. 

2 Except as indicated these data are compiled from the following sources: 1880 to 1931, Mineral 
Resources of the United States, published annually by the U.S. Geological Survey, 1882 to 1923, and by 
the U.S. Bureau of Mines 1924 to 1931; 1932 to date, Minerals Yearbook, published by the U.S. Bureau 
of Mines. The writer is indebted to Miss Martha B. Clark of the Bureau of Mines (retired) for checking 
the figures and for supplying unpublished revised figures for some of the series. 
that is, relatively small in terms of value, in comparison with such 
series as petroleum and coal which, because of their much greater economic 
value, dominate the index. Although these minor series have 
but a small effect on the total index, they are nevertheless important 
as components for the group indexes which it is desirable to have as 
accurate and independent measurements. For in some of the group 
indexes, which are largely composed of series of small economic value, 
these series have in fact become major components. Moreover, to in- 
sure comprehensiveness in scope it is necessary to include some of these 
series to give representativeness to certain minerals in the group in- 
dexes. It is for these reasons that these series have been included even 
though they have but a minor effect on the trend or the movements of 
the total index. 

The sixty-three series are classified into the following groups and 
sub-groups for each of which an index is computed: 


I. Fuels 
II. Metals 
A. Ferrous 
B. Nonferrous 
1. Base 
2. Monetary 
3. Other 
III. Non-Metals 
A. Building Materials 
B. Chemicals 
C. Other 


Annual figures for all 63 series are available only from 1923 to date.* 
From 1922 and extending backward to 1880, data for 28 series are lack- 
ing. Of these 28 series, 16 represent the entrances of new minerals into 
production after 1880, for which it is assumed that these minerals were 
not extracted prior to the date on which figures were first reported. 
The remaining 12 represent minerals that were in production prior to 
the date for which data on output were first collected. The initial dates 
of the series in the index are given in the Appendix, Table I. 

No allowance need be made in the index for those series representing 
entrances of new minerals into production. However, for those other 
series representing minerals already in production for which data on 





* For a list of the component series, together with the initial date on which each series is included 
in the index, and the percentage of the weight assigned to each series to the total weight assigned to all 
series, see Appendix, Table 1. 
output are not available for a part of the period covered, an adjustment 
in the index must be made. 

Index makers generally assume that the absent segments of com- 
ponent series vary with the present component series in the index as a 
whole in trend and cyclical movements, and therefore make allowance 
in the weights to permit the present series in the index to represent the 
missing segments of the series. We have made such an adjustment for 
the following seven of the twelve series on which data were missing for 
the earlier years: cadmium compounds, natural magnesium salts, cal- 
cium-magnesium chloride, tripoli, greensand, high-grade clay and gar- 
net—all of which have only a negligible weight in the index. The series 
on mine production of zinc which began in 1907 could have been 
treated in the same way as the other seven series. But since it is an 
important series and a comparable series on the production of domestic 
slab zinc is available back to 1880, it is extended back to 1880 by 
correlating mine production and slab zinc production for the period 
1907 to 1939. 
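
The extension of the zinc series can be illustrated by a minimal sketch, on the assumption that "correlating" the two series means fitting a straight line to the years in which both are reported and reading the missing years off that line. The text does not state the form of the relation actually fitted; the function names and all figures below are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def backcast(proxy, target):
    """Fill years missing from `target` from its fitted relation to `proxy`.

    Both arguments map year -> quantity. Only years present in both
    series enter the fit; reported figures are never overwritten.
    """
    overlap = sorted(set(proxy) & set(target))
    a, b = fit_line([proxy[y] for y in overlap], [target[y] for y in overlap])
    filled = dict(target)
    for year, value in proxy.items():
        if year not in filled:
            filled[year] = a + b * value
    return filled


# Hypothetical figures, not taken from the index.
slab_zinc = {1905: 200.0, 1906: 220.0, 1907: 250.0,
             1908: 210.0, 1909: 260.0, 1910: 270.0}
mine_zinc = {1907: 300.0, 1908: 252.0, 1909: 312.0, 1910: 324.0}

extended = backcast(slab_zinc, mine_zinc)
for year in sorted(extended):
    print(year, round(extended[year], 1))
```

In this sketch the reported mine figures are kept as they stand and only the earlier years are estimated.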

The remaining four series—common clay, sand and gravel, lime, and 
stone—differ so widely in trend from the other component series that 
go all the way back to 1880 and are important components in the early 
part of the index that their absence would greatly distort the trend of 
the index, particularly the trends of the building material and non- 
metallic group indexes. As a matter of fact long before production data 
were first reported on these products (clay beginning in 1898, and the 
others in the early part of 1900), even back to 1880, these four minerals 
had already attained a high level of output so that their lines of growth 
rose but slowly in comparison with the other minerals whose output 
was just beginning, but rising rapidly from a very low level. Because of 
these differences in their lines of growth, adjustment for the missing 
segments of component series by the conventional method of permit- 
ting the present components to represent the absent segments cannot 
be used. Therefore, if the index is to be extended back to 1880 the 
missing segments of these four series must be estimated. 
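
A small numerical sketch may show why the conventional method fails here. It assumes two hypothetical series with equal unit weights, one rising rapidly from a low level and one already at a high level in 1880. Leaving the slow-growing series out, which is equivalent to letting the present series stand in for it, depresses the early index numbers sharply. None of the figures are from the index.

```python
def aggregate_index(series, weights, base_year):
    """Weighted aggregate of the given series, with base_year = 100."""
    def total(year):
        return sum(weights[name] * series[name][year] for name in series)

    years = sorted(next(iter(series.values())))
    return {year: 100.0 * total(year) / total(base_year) for year in years}


# Hypothetical quantities and equal unit weights.
all_series = {
    "cement": {1880: 1.0, 1937: 100.0},   # rising rapidly from a low level
    "stone": {1880: 50.0, 1937: 100.0},   # already at a high level in 1880
}
weights = {"cement": 1.0, "stone": 1.0}

with_estimate = aggregate_index(all_series, weights, base_year=1937)
without_stone = aggregate_index({"cement": all_series["cement"]},
                                weights, base_year=1937)

# 1880 index number: 25.5 with the slow-growing series, 1.0 without it.
print(with_estimate[1880], without_stone[1880])
```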

There is a fairly good correlation between the common clay series 
which begins in 1898 and a building construction series. This is not 
surprising in view of the fact that clay is the mineral used in the manu- 
facture of bricks, tiles, and other building materials. The building con- 
struction series used is that of John R. Riggleman on the value of 
building permits issued in a varying number of cities for the period 
1875 to 1940 adjusted for the changing number of cities and deflated 
by an index of construction costs.4 The clay production series is extended 
back to 1880 by correlating the common clay series with the 
building construction series. 

The greater part of the output of sand and gravel is used as aggre- 
gates with such bonding materials as cement, lime and gypsum in 
building construction and road paving. Comparison of the series on the 
quantity production of sand and gravel which begins in 1905 with the 
series on the value of the output of cement, lime and gypsum deflated 
by an index of building material prices indicates that there is a close 
agreement between them. Accordingly the missing segment of the series 
on sand and gravel from 1880 to 1904 is estimated on the basis of a 
correlation analysis of the two series. 

The absent segment of the series on the quantity of lime produced 
from 1880 to 1903 is estimated on the basis of a series on the value of 
production of lime deflated by an index of building material prices. 
Similarly the series on building stones which begins in 1916 is extended 
back to 1880 by means of a series on the value of the production of 
building stones, deflated by an index of building material prices. 
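
The estimate from a deflated value series can be sketched as follows. The text says only that the missing quantities are estimated "on the basis of" the deflated values. The linking step used here, scaling the deflated series to the reported quantity in the first year both exist, is an assumption, and the names and figures are hypothetical.

```python
def deflate(value, price_index, base=100.0):
    """Express a value series at constant prices."""
    return {y: value[y] * base / price_index[y]
            for y in value if y in price_index}


def extend_quantity(quantity, deflated_value):
    """Carry a quantity series backward on the movement of a deflated value series."""
    link_year = min(set(quantity) & set(deflated_value))
    factor = quantity[link_year] / deflated_value[link_year]
    extended = {y: v * factor
                for y, v in deflated_value.items() if y < link_year}
    extended.update(quantity)  # reported quantities are kept unchanged
    return extended


# Hypothetical figures, not taken from the index.
lime_value = {1902: 6.4, 1903: 8.1, 1904: 10.0}       # current dollars
price_index = {1902: 80.0, 1903: 90.0, 1904: 100.0}   # building material prices
lime_quantity = {1904: 2.5, 1905: 2.6}                # reported quantities

constant_price_value = deflate(lime_value, price_index)
lime_extended = extend_quantity(lime_quantity, constant_price_value)
for year in sorted(lime_extended):
    print(year, round(lime_extended[year], 3))
```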

The effect of including the estimated series as components is to raise 
the level of the building material group index by 40 points, the non- 
metallic group index by 20 and the mineral index by 2 in the 80’s, 90’s 
and early 1900’s. Thus, for example, with 1935-39 as the base and with 
the estimated series as components, the index numbers in 1897 for 
building materials are 50, for non-metals 27, and for all minerals 25; 
without the estimated series as components, the index numbers are 10, 
9, and 23 respectively. 

The aggregative method is used to combine the 63 series into composite 
index numbers.5 Apart from its other merits, this method permits 
a shift to a new base without recomputing the entire index. As 
the index is designed as a continuing measurement of mineral produc- 
tion in this country for years to come, and as it would therefore necessi- 
tate from time to time a shift to more recent bases in line with other 
economic measurements, this advantage becomes an important con- 
sideration in selecting the formula. Already we have experienced one 





4 The data on building construction were kindly supplied by Dr. Riggleman. For a description of 
the series see John R. Riggleman and Ira N. Frisbee, Business Statistics, McGraw-Hill Book Co., N.Y. 
1938, pp. 720-22. 
5 The formula is … base period; Qs, the quantity of the mineral in the base period; and Q, the quantity of mineral in any 
given year. 
such shift—that from the 1923-25 base on which, in common with the 
then existing production measurements, the index was originally com- 
puted, to the present 1935-39. Undoubtedly there will be a shift in the 
near future to a more recent postwar base, perhaps to the 1945-49 or 
some period within this interval. 
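
The base-shifting property can be sketched under an assumed form of the aggregative method: each year's quantities are valued at fixed unit values, and the total is divided by the corresponding base-period total. This is the usual fixed-weight aggregative form and is offered as an assumption, not as the formula of footnote 5. The function names and figures are hypothetical.

```python
def aggregative_index(quantities, unit_values, base_years):
    """Fixed-weight aggregative index with the average of base_years = 100."""
    def aggregate(year):
        return sum(unit_values[m] * quantities[m][year] for m in quantities)

    years = sorted(next(iter(quantities.values())))
    base_level = sum(aggregate(y) for y in base_years) / len(base_years)
    return {y: 100.0 * aggregate(y) / base_level for y in years}


def rebase(index, new_base_years):
    """Shift an existing index to a new base without touching the components."""
    base_level = sum(index[y] for y in new_base_years) / len(new_base_years)
    return {y: 100.0 * v / base_level for y, v in index.items()}


# Hypothetical quantities and unit values.
quantities = {
    "coal": {1935: 10.0, 1936: 12.0, 1937: 15.0},
    "petroleum": {1935: 5.0, 1936: 8.0, 1937: 10.0},
}
unit_values = {"coal": 2.0, "petroleum": 1.0}

on_1935 = aggregative_index(quantities, unit_values, base_years=[1935])
on_1937 = rebase(on_1935, new_base_years=[1937])
direct_1937 = aggregative_index(quantities, unit_values, base_years=[1937])
print(on_1935, on_1937, direct_1937)
```

With weights held fixed, dividing the finished index by its level in the new base period gives the same numbers as recomputing from the component series, which is the advantage the text describes.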

The weights employed in combining the component series into the 
total index and the group indexes are based upon figures on the value 
of each mineral at the mines. The value data used are those compiled 
by the Bureau of Mines, which for most of the minerals represent the 
annual value of production at the mines. For a few of the metals— 
copper, lead, zinc, gold and silver—the value of production figures represent 
the value at the market place, and for a few of the non-metallic 
mineral products—cement, lime and building stones—the data repre- 
sent value of the finished products rather than value of the raw materi- 
als at the quarries. In these instances, it is necessary to estimate the 
value of production of the mineral at the mines or quarries on the basis 
of the decennial data on value of production at mines or quarries compiled 
by the Bureau of the Census.6 A ratio of the Census figures on value 
per unit of product to the Bureau of Mines figures on value per unit of 
product of a mineral for a given census year is computed. The estimated 
value of production per pound of copper for the weighting period 1935- 
39, for example, is obtained by applying the Census-Bureau of Mines 
ratio for 1939 to the Bureau of Mines figures on value per pound of 
copper for the period 1935-39. The estimated value figures for the 1923- 
25 and the earlier weighting periods are derived on the basis of Census- 
Bureau of Mines ratios for the census year 1929. 

The weight factor for a component series is computed by dividing 
the value of production for a given weighting period by the quantity of 
production of the mineral for the same period. Thus the weight factor 
is simply the average unit value of that mineral for the given weighting 
period. 
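
The arithmetic of the weight factor, and of the Census-Bureau of Mines adjustment described in the preceding paragraph, can be sketched as below. The function names and the copper figures are hypothetical and serve only to show the order of operations.

```python
def weight_factor(total_value, total_quantity):
    """Average unit value of a mineral over a weighting period."""
    return total_value / total_quantity


def mine_unit_value(bom_unit_value, census_unit_value, bom_unit_value_census_year):
    """Scale a Bureau of Mines unit value by the Census / Bureau of Mines ratio.

    The ratio is taken for a single census year and applied to the unit
    value of the whole weighting period.
    """
    ratio = census_unit_value / bom_unit_value_census_year
    return bom_unit_value * ratio


# Hypothetical copper figures: dollars and pounds.
market_unit_value = weight_factor(total_value=500.0, total_quantity=5000.0)
copper_weight = mine_unit_value(
    market_unit_value,
    census_unit_value=0.08,           # value per pound at the mine, census year
    bom_unit_value_census_year=0.10,  # value per pound at market, same year
)
print(round(market_unit_value, 4), round(copper_weight, 4))
```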

An index of production based on weights derived from value of pro- 
duction figures of a given period will register accurately the physical 
volume production for a series of years only on the assumption that 
there is throughout the period covered by the index no shift in the 
demand for the various minerals. Our examination of the data shows 
that there were marked shifts in the demand for certain minerals, or in 
other words, significant changes in the relative importance of these 





6 See for example, Sixteenth Census of the United States, Mineral Industries 1939, Vol. 1. 
TABLE I 

SHIFTS IN RELATIVE IMPORTANCE OF MINERAL GROUPS AS INDICATED BY 
PERCENTAGE CHANGES IN WEIGHTS BY PERIODS: 1889-91, 1909-13, 
1923-25 AND 1935-39 

Mineral Group | Per Cent of Total 1889-91 | Per Cent of Total 1909-13 | Per Cent of Total 1923-25 | Per Cent of Total 1935-39 
minerals during the past seven decades. Although value data are available 
for every year covered it has been found upon analysis that only 
those for certain periods show sufficient stability, that is, freedom from 
violent fluctuations, to be considered as best suitable for weighting 
purposes. These periods are 1889-91, 1909-13, 1923-25 and 1935-39. 
For each of these periods a set of weight factors is derived. Four sets of 
overlapping indexes are constructed with the four sets of weights: the 
1889-91 weights are used to compute the index and group indexes for 
the period 1880-1903, the 1909-13 weights for the period 1897-1920, 
the 1923-25 weights for the period 1917-39, and the 1935-39 for the 
period 1929-48. 
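
The arrangement of weight periods and index segments just described can be set out as a small lookup. The periods and years are those stated in the text; the function names are illustrative only.

```python
# (weight period, first year of segment, last year of segment), as stated in the text.
SEGMENTS = [
    ("1889-91", 1880, 1903),
    ("1909-13", 1897, 1920),
    ("1923-25", 1917, 1939),
    ("1935-39", 1929, 1948),
]


def weight_periods_for(year):
    """Weight periods of every segment that covers the given year."""
    return [period for period, first, last in SEGMENTS if first <= year <= last]


def overlaps():
    """Years shared by each pair of consecutive segments."""
    shared = []
    for (_, _, prev_last), (_, next_first, _) in zip(SEGMENTS, SEGMENTS[1:]):
        shared.append((next_first, prev_last))
    return shared


print(weight_periods_for(1900))  # a year covered by two segments
print(overlaps())                # [(1897, 1903), (1917, 1920), (1929, 1939)]
```

The overlaps are the stretches of years in which two sets of weights give competing index numbers.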

The extent of the shift in relative importance of the mineral groups 
with the passage of time may be observed from Table I showing the 
percentage of the value of production of each group to total minerals 
for the weight periods, 1889-91, 1909-13, 1923-25 and 1935-39.7 It 
will be noted, for example, that the fuel group increased in relative 
importance, while the metal and nonmetal groups were declining. Individual 
minerals too underwent similar major shifts; for example, within 
the fuel group, crude petroleum gained in relative importance at the 
expense of coal. 

The four overlapping segments of the indexes of minerals and of the 





7 For the period 1935-39, the percentage of the value of each mineral to the total of all minerals 
included in the index is presented in Table I of the Appendix. 
CHART I 


COMPARISON OF SEGMENTS OF INDEXES OF THE PRODUCTION OF MINERALS, 
METALS, NONMETALS AND FUELS COMPUTED WITH 1889-91, 1909-13, 
1923-25 AND 1935-39 WEIGHTS 
three major groups—metals, nonmetals and fuels—computed with four 
different sets of weights, are presented in Chart I. It will be observed 
that the overlapping segments of each index differ from each other as 
to level, though not as to direction, and that the differences, that is, the 
gaps are greater between the earlier segments than the more recent 
ones—between the 1880-1903 and the 1897-1920 or the 1897-1920 and 
the 1917-39 than between the 1917-39 and the 1929-48. Note also that 
CHART II 


INDEXES OF THE PHYSICAL VOLUME PRODUCTION OF MINERALS, 
METALLIC MINERALS, NONMETALLIC MINERALS AND 
FUEL MINERALS, 1935-39 = 100 


some of the gaps between the overlap of two segments tend to widen 
toward both ends of the overlap. These differences also occur in the 
overlapping segments of subgroups of the indexes, which are not graph- 
ically shown here, and in some instances they are more pronounced. 
The discrepancies in level and the fact that the gaps between over- 
lapping segments tend to spread further apart towards the ends of the 
overlap, suggest that where an index extends over a number of decades, 
it is better to break it up into segments and weight each segment with a 
separate set of weights derived from data that best represent the relative 
importance of one component series to another within each segment 
rather than to apply a single set of weights to the entire index. 
The remarkable fact, however, is not that there are differences in the 
overlaps of the segments but that the gaps between the overlaps are 
not further apart than they are, considering the differences in the 
weights used. 

The separate segments of the index and of each group and subgroup 
index must be spliced together to form continuous series of index 







CHART III 


INDEXES OF THE PHYSICAL VOLUME PRODUCTION OF FERROUS METALS, 
NONFERROUS METALS, BASE METALS AND MONETARY METALS, 
1935-39 = 100 


numbers spanning the entire period from 1880 to the latest year. Two 
methods of splicing are employed. 

Where the gap between the overlapping segments is narrow, that is 
within one point, the splicing is accomplished by taking a simple aver- 
age of each of the pairs of overlapping index numbers for three ad- 
jacent years. 
Where the gaps are further apart, the method adopted is that of a 
progressively weighted geometric mean of each pair of overlapping in- 
dex numbers of three or five adjacent years centered at a splicing origin 
in the middle of this interval. At the splicing origin, equal weight is 
given to the pair of index numbers to be spliced. For the year preceding 
this origin, the index number for the earlier of the two overlapping 
segments is given a weight of two, while that for the later segment a 
CHART IV 

INDEXES OF THE PHYSICAL VOLUME PRODUCTION OF CONSTRUCTION 
MINERALS, CHEMICAL MINERALS AND OTHER NONMETALLIC 
MINERALS, 1935-39 = 100 

weight of one; and for the year following this origin the index number 
for the later segment is given a weight of two, while that of the earlier 
segment a weight of one. Similarly in the case of a five-year splicing 
interval, the index number for the earlier segment for the second year 
preceding the origin is weighted three times as much as that for the 
later segment; while that for the later segment for the second year 
following the origin is weighted three times as much as that for the 
earlier segment. Take, for example, the splicing of the 1917-39 segment 
to the 1929-48 segment. The splicing interval chosen is the three years 
1930-32, or the five years, 1929-1933, with the origin at 1931, the center 
of both intervals. For 1931 the index number is given a weight of one 
for both the 1917-39 and the 1929-48 segments; for the year 1930 the 
index number for the 1917-39 segment is given a weight of two, while 
that for the 1929-48 segment a weight of one; for the year 1932 the 
index number for the 1917-39 segment is weighted one, while that for 
the 1929-48 segment is weighted two. 
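
The progressively weighted geometric mean can be sketched as follows. Two points are assumptions. Both segments are taken to be already expressed on a common base. In the five-year case the years adjacent to the origin are taken to carry the same two-to-one weights as in the three-year case, which the text implies but does not state. The index numbers are hypothetical.

```python
import math


def splice_weights(n_years):
    """(earlier, later) weights for each year of an odd-length splicing interval."""
    half = n_years // 2
    pairs = []
    for offset in range(-half, half + 1):
        earlier = 1 - offset if offset < 0 else 1
        later = 1 + offset if offset > 0 else 1
        pairs.append((earlier, later))
    return pairs


def weighted_geometric_mean(a, b, wa, wb):
    """Geometric mean of a and b with weights wa and wb."""
    return math.exp((wa * math.log(a) + wb * math.log(b)) / (wa + wb))


def splice(earlier, later, origin, n_years=3):
    """Join two overlapping index segments around a splicing origin."""
    half = n_years // 2
    spliced = {y: v for y, v in earlier.items() if y < origin - half}
    spliced.update({y: v for y, v in later.items() if y > origin + half})
    interval = range(origin - half, origin + half + 1)
    for year, (we, wl) in zip(interval, splice_weights(n_years)):
        spliced[year] = weighted_geometric_mean(earlier[year], later[year], we, wl)
    return spliced


# Hypothetical overlapping index numbers for the two segments.
earlier_segment = {1929: 90.0, 1930: 80.0, 1931: 64.0, 1932: 50.0, 1933: 55.0}
later_segment = {1929: 110.0, 1930: 100.0, 1931: 100.0, 1932: 60.0, 1933: 66.0}

spliced = splice(earlier_segment, later_segment, origin=1931, n_years=3)
for year in sorted(spliced):
    print(year, round(spliced[year], 2))
```

Outside the splicing interval the sketch takes the earlier segment before the interval and the later segment after it; within the interval the spliced figure moves from one toward the other.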

This weighting scheme is based on the assumption that the earlier 
segment, being computed with weights which better represent the rela- 
tive importance of one component series to another within that seg- 
ment, measures more accurately the physical volume output of the 
mineral industries for years prior to the splicing origin than the later 
segment; while, on the other hand, for similar reason, the later segment 
gages more precisely the productive activities for years following the 
splicing origin. This assumption is borne out by the observation previ- 
ously made that the gap between two overlapping segments tends to 
spread increasingly apart toward the terminals of the overlap. 

The splicing interval for the 1880-1903 and the 1897-1920 segments 
for the total index and group and subgroup indexes is the three years 
1899, 1900 and 1901 with the origin at 1900, except for the nonmetal 
group, for which the five years, 1896-1901 are chosen, with the origin 
at 1898. The splicing interval for the 1897-1920 and the 1917-39 segments 
for the index and groups and subgroups is the three years 1918-20 
with the origin at 1919. Finally the splicing interval for the 1917-39 
and 1929-48 segments is the three years 1930-32, with the origin at 
1931 except for the “other nonferrous” subgroup, for which the five 
years 1929-33, with origin at 1931, are selected. 

The period 1935-39 is chosen as the base. The choice is based on the 
fact that this period has been widely adopted by federal agencies as 
the standard base of comparison for their important economic measurements.8 
However, as we have already pointed out, this index is so constructed 
that it, together with its groups and subgroups, can readily be 
converted to a more recent base without laborious recomputations. 

The index numbers of the physical volume production of all minerals 
and of groups and subgroups of minerals are presented in the Appendix, 
Table II, and also illustrated graphically on Charts II, III and IV. 





8 This base was recommended by the Central Statistical Board, the predecessor of the Division of 
Statistical Standards, Bureau of the Budget, as the most suitable recent period for adoption as the 
standard base. 
APPENDIX 
TABLE I 

SERIES INCLUDED IN THE INDEX OF MINERAL PRODUCTION 

Series | Percentage of Total in Index in 1935-39 | Initial Date of Series in Index 

Minerals—Total 
  Metals 
    Ferrous 
      Iron Ore 
      Manganese (35% or more Mn) 
      Manganese (5 to 35% Mn) 
      Molybdenum 
      Tungsten 
    Nonferrous 
      Base 
        Copper 
        Lead 
        Zinc 
      Monetary 
        Gold 
        Silver 
      Other 
        Bauxite 
        Cadmium 
        Magnesium 
        Mercury 
        Platinum 
  Nonmetals 
    Construction 
      Asphalt 
      Cement 
      Common Clay 
      Gypsum 
      Lime 
      Magnesite 
      Sand and Gravel 
      Stones 
        Basalt 
        Granite 
        Limestone 
        Marble 
        Sandstone 
        Slate 
        Miscellaneous 
    Chemicals 
      Arsenious Oxide 
      Barite 
      Borates 
      Bromine 
      Calcium-magnesium Chloride 
      Fluorspar 





100.00 
11.87 
4.11 
3.48 
.02 

.06 
.46 










* 1880-1905 included in Iron Ore. 
† 1880-1906 estimated. 
‡ 1880-88 estimated. 
§ 1880-1903 estimated. 
‖ 1880-1904 estimated. 
¶ Actual data on various kinds of stones beginning in 1916; estimates for stones as a whole 1880-1915. 
TABLE I (Continued)

Magnesium Compounds 
Phosphate Rock 
Potash 
Pyrites 
Sodium Chloride 
Sodium Carbonates
Sodium Sulfates 
Sulphur 

Other 
Garnet 
Grindstone 
Pulpstone 
Pumice and Pumicite 
Silica Sand and Sandstone 
Tripoli 
Quartz
High Grade Clay 
Fuller’s Earth 
Greensand 
Feldspar 
Mica Sheet 
Mica Scrap 
Ground Talc

Fuels 
Anthracite 
Bituminous Coal 
Natural Gas 
Natural Gasoline 
Petroleum 
ON THE USE OF THE COUNTY AS THE PRIMARY
SAMPLING UNIT FOR STATE ESTIMATES* 


Lillian H. Madow


Our purpose is to consider the efficiency of the county as the 
primary sampling unit for making state estimates of some so- 
cio-economic and agricultural characteristics of the farm popu- 
lation in North Carolina. The main emphasis is on the general 
purpose type of survey for the State as a whole—one in which 
both socio-economic and agricultural characteristics are to be 
estimated; the study is not directed at obtaining ‘best’ estimates
for specialized items. Calculations herein presented indicate
that the county is not a very efficient primary sampling
unit for state estimates. It is expected that future investigation 
will show that the minor civil division, or some better defined 
statistical area of comparable size, will produce better results. 
Variances and relative costs need to be calculated, however, 
before it is possible to evaluate the relative gains of the smaller 
over the larger unit. 


INTRODUCTION 


THE SAMPLING plan used perhaps most widely for state agricultural
estimates is a two-stage sample in which the county is the primary
sampling unit and a small cluster of farms is the secondary sampling
unit. The county is usually used as the primary sampling unit because
data are readily available for the county, and because the county is a
convenient administrative unit.

With regard to the use of the county as the primary sampling unit,
the main questions investigated are the following:

1. How large are the gains due to stratification of the primary 
sampling units, and to what extent does it pay to stratify? The 
study shows that both 10 and 20 strata produce appreciable gains 
in efficiency over the unstratified case, and that stratification of 
the 10 strata into 20 also produces gains attributable to stratifica- 
tion but to a lesser degree. Estimates of gains with 40 strata indi- 
cate a continuing decline in the gains due to additional stratifica- 
tion. 

2. How many of the 100 counties in North Carolina are necessary
to give reliable estimates, if only the simple unbiased estimation 
equation is used? Studies using the unbiased estimation equation 





* Prepared while the author was Resident Collaborator for the Bureau of Agricultural Economics 
at the Institute of Statistics, North Carolina State College, and presented at a seminar at the Institute 
of Statistics of the University of North Carolina, December 1948. 
based on 20 strata indicate that 20 counties are not sufficient to
estimate most items with an accuracy of a 5 per cent coefficient of 
variation (c.v.), even if the counties selected are perfectly repre- 
sented within. In fact, it is judged that even samples of 40 
counties often will not yield estimates within a 5 per cent c.v., 
if the simple unbiased estimate is used; and if greater accuracy is 
wanted, say a 2 per cent c.v., then for most items not even the 
40 county estimate will be satisfactory. 

3. Since administrative and cost considerations generally limit most
investigations in North Carolina to about 20 counties, how can
estimates based on 20 counties be improved? One way is to improve
the estimation procedure. The three procedures we consider
are:

a. ratio to estimated number of farms at time of survey, 

b. ratio to same characteristic at a previous census, and 

c. regression estimate. 

The first procedure gives little improvement, but the second 
gives a great improvement, sufficient to make 20 counties ade- 
quate for most purposes; the third procedure shows slight gains 
over the second. The difficulty with these procedures is that the 
estimate is more laborious to calculate than the simple additive 
ones, and, in the case of (b) and (c), comparable data may not be 
available at an earlier census date. 

Another way to improve state estimates is to use area
substratification¹ or equivalent techniques, either with the county or a
smaller area as the p.s.u.

Still another technique would be to have a complete enumera- 
tion of large farms in addition to the sampling of the smaller ones. 

Although we have not studied the effects of these two methods 
of improving the estimates, we conjecture that they, as well as 
any others in which the county is used as the p.s.u., will not 
reduce the sampling variances sufficiently to make the county a 
desirable p.s.u. for state agricultural estimates in states such as 
North Carolina. 

It must be emphasized that throughout this paper results are pre- 
sented only for the between-county variances. Investigations are also 
in order for methods of reducing the within-county variance. Ordinarily, 
the between-county variance dominates the over-all variance of a two- 





1 Hansen, Morris H. and Hurwitz, William N., “On the Theory of Sampling from Finite Populations,”
The Annals of Mathematical Statistics, Vol. XIV, No. 4, December, 1943.
stage sampling design of the kind described. Furthermore, the between-
county variance will be the same for all estimation equations that yield 
unbiased estimates of the county totals, if the same method of selecting 
counties is used. 

One by-product of the study has been that the calculations show
that the (c.v.)² for the between-county contribution based on 1940
data comes within 10 to 20 per cent of the (c.v.)² of the design actually
used, for almost all items. The only items that showed a larger
discrepancy were those considerably affected by the war situation, like
work off farm. Therefore, even in such a period as 1940 to 1945, a
(c.v.)² based on the earlier date gave satisfactory estimates of the
(c.v.)² needed for planning a survey in 1945. Exceptions occurred only
in items considerably affected by the disturbances of the intervening
period.

MATERIALS 

The materials used in carrying through the study are: 

1. Type of farming areas in North Carolina, prepared by the North 
Carolina State College Department of Agricultural Economics, used 
as a basis for stratification in this study. 

2. County summary data from the U. S. Censuses of Agriculture,
for 1940 and 1945, used for 


(a) establishing strata of approximately equal size, as measured by 
the number of farms. 

(b) calculating between-county variances for unbiased, ratio and 
regression estimates. 


METHOD OF STRATIFICATION 


Two basic sets of strata are used in this study—one consisting of 10 
strata and the other of 20. Both are based on a stratification of the 
100 counties of the state made by the North Carolina Department of 
Agricultural Economics, classifying the counties according to type of 
soil and type of farming. This stratification produced 8 major strata, 
some of which were subdivided, making a total of 12 strata. Map No. 1 
shows this stratification. The basic 10 strata used in this study are 
obtained from these 12 by equalizing the total number of farms in each 
stratum. In this process, some of the original stratum lines are violated, 
but geographic contiguity is maintained in all cases. 

The 20 strata are then formed from the 10 by subdividing each of 
them into 2 parts in such a way that again there are approximately 
equal numbers of farms in each stratum and geographic contiguity 
remains. Map No. 2 shows how the 100 counties of North Carolina are 
MAP NO. 1
TYPES OF FARMING AREAS IN NORTH CAROLINA 
allocated to the 20 strata. The 10 strata are also easily identified by
combining the A and B strata for each Roman numeral. 

No claim is made that these strata are ‘best’ in any sense, and without 
any doubt, the basic strata can be improved, but it is doubted that the 
improvements will be marked in any fairly general purpose survey. 
Examination of the individual county contributions to the over-all 
variance shows that it is difficult to set up ‘best’ strata for several items 
at the same time. What appears to be good stratification for one item 
turns out to be poor for another. 


BASIC SAMPLING PLAN AND EQUATIONS USED 


In order to approximate the situation of an actual survey, namely, 
that in which basic county data are available for a base period a few 


MAP NO. 2 
BASIC STRATA 
years prior to the date of a survey, we consider the problem of
estimating the total of a characteristic, say number of tenants, for the
farm population of North Carolina in 1945, using a sampling plan based 
on 1940 data. Census data for 1940 and 1945 are then used for calcula- 
tions of variances to evaluate the alternative designs considered. 

Although the calculated results in this paper are only for the
between-county variance, assuming that the counties selected for the
sample are then completely enumerated, the formulae presented are
for the complete two-stage design in which secondary sampling units,
usually small clusters of farms, often the so-called master sample²
segments, are selected and enumerated within the counties selected for
the sample. The design presented here is only one of many possible
satisfactory designs, this one being given mainly because it is so
popular.

The basic sampling plan consists of the following steps. 

1. Stratify the counties of North Carolina into G strata. 

2. From each stratum, select one county with probability
proportionate to size (p.p.s.),¹ where the measure of size is the 1940
number of farms.

3. Select the same proportion, $t$, of farms from each stratum, which
implies a determined constant expected number of farms from
whichever county within the given stratum is selected for the
sample.

4. Subsample master sample segments or other secondary sampling
units systematically from within the counties selected in step 2.
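
The selection steps above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually programmed for the study: the stratum, the county names, and the helper names `select_county_pps` and `segments_to_sample` are hypothetical, and the subsample size uses the self-weighting relation $m_{ai} = t\,(P_a/P_{ai})\,M_{ai}$, which is our reading of the formula in the text.

```python
import random


def select_county_pps(counties, rng):
    """Pick one county with probability proportionate to its 1940 farm count.

    counties: list of (name, farms_1940, segments) tuples for one stratum.
    """
    weights = [farms for _, farms, _ in counties]
    return rng.choices(counties, weights=weights, k=1)[0]


def segments_to_sample(t, stratum_farms, county_farms, county_segments):
    """m_ai = t * (P_a / P_ai) * M_ai (may be fractional before rounding)."""
    return t * stratum_farms / county_farms * county_segments


# Hypothetical stratum: (county, 1940 farms P_ai, segments M_ai).
stratum = [("A", 2000, 400), ("B", 3000, 600), ("C", 5000, 1000)]
P_a = sum(farms for _, farms, _ in stratum)

rng = random.Random(0)
name, P_ai, M_ai = select_county_pps(stratum, rng)
m_ai = segments_to_sample(t=0.01, stratum_farms=P_a,
                          county_farms=P_ai, county_segments=M_ai)
```

In this made-up stratum every segment holds five farms, so the same number of segments is taken whichever county is drawn; that is the "determined constant expected number of farms" of step 3.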

The problem is to estimate $x$, the total of a characteristic in 1945.
Thus, we want

To estimate: $$x = \sum_{a=1}^{G} \sum_{i=1}^{H_a} \sum_{\mu=1}^{M_{ai}} x_{ai\mu},$$

where $x_{ai\mu}$ is the total of the characteristic in the $\mu$th segment of the
$i$th county of the $a$th stratum. Thus, there are $G$ strata, the $a$th of
which contains $H_a$ counties, the $i$th of which contains $M_{ai}$
segments. It is to be noted that $G$, $H_a$, and $M_{ai}$ are the same for 1940
as for 1945.

To estimate $x$, we use

Estimation equation: $$x' = \frac{1}{t} \sum_{a=1}^{G} \sum_{\mu=1}^{m_{ai}} x_{ai\mu},$$





2 King, A. J. and Jessen, R. J., “The Master Sample of Agriculture,” Journal of the American Sta- 
tistical Association, March, 1945. 
where $t$ is the over-all sampling rate and $m_{ai}$ is the number of segments
in the sample from the $i$th county of the $a$th stratum; it is calculated
by means of

$$m_{ai} = t\,\frac{P_a}{P_{ai}}\,M_{ai},$$

where $P_a = \sum_{i=1}^{H_a} P_{ai}$ = total number of farms in the $a$th stratum in
1940.

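It is easy to show that $x'$ is an unbiased estimate of $x$, and that its
variance is given by

Variance of estimate:

$$\sigma^2_{x'} = \sum_{a=1}^{G}\left[\sum_{i=1}^{H_a}\frac{P_{ai}}{P_a}\,\frac{m_{ai}}{t^2}\,\frac{M_{ai}-m_{ai}}{M_{ai}-1}\,\sigma^2_{ai}\right] + \sum_{a=1}^{G}\left[P_a^2\sum_{i=1}^{H_a}\frac{P_{ai}}{P_a}\left(\bar{x}_{ai}-\bar{x}_a\right)^2\right],$$

where

$$\sigma^2_{ai} = \frac{1}{M_{ai}}\sum_{\mu=1}^{M_{ai}}\left(x_{ai\mu}-\bar{\bar{x}}_{ai}\right)^2,\qquad \bar{\bar{x}}_{ai} = \frac{\sum_{\mu=1}^{M_{ai}} x_{ai\mu}}{M_{ai}},$$

$$\bar{x}_{ai} = \frac{\sum_{\mu=1}^{M_{ai}} x_{ai\mu}}{P_{ai}},\qquad \bar{x}_a = \frac{1}{P_a}\sum_{i=1}^{H_a}\sum_{\mu=1}^{M_{ai}} x_{ai\mu}.$$
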
In the formula given here, it is assumed that the method of selection 
of secondary sampling units is equivalent to a random sample without 
replacement within the county. It is to be noted that nowhere in the 
formulae do we need the 1945 total number of farms either for the 
counties, the strata, or the state. 
We see from the formula for the variance of the estimate that the
first term gives the within-county contribution, while the second term 
gives the between-county contribution to the over-all variance. 
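
The between-county contribution can be computed directly from county totals. The sketch below assumes that the selected county is completely enumerated and writes the second term in the algebraically equivalent form $\sum_a \sum_i (P_{ai}/P_a)\,[(P_a/P_{ai})\,x_{ai} - x_a]^2$, where $x_{ai}$ is the county total and $x_a$ the stratum total. The function names and the figures are hypothetical.

```python
def between_county_variance(strata):
    """Sum over strata of sum_i (P_ai/P_a) * ((P_a/P_ai) * x_ai - x_a)**2.

    strata: list of strata, each a list of (P_ai, x_ai) pairs, where P_ai is
    the 1940 number of farms and x_ai the 1945 county total.
    """
    total = 0.0
    for counties in strata:
        P_a = sum(P for P, _ in counties)
        x_a = sum(x for _, x in counties)
        total += sum((P / P_a) * ((P_a / P) * x - x_a) ** 2
                     for P, x in counties)
    return total


def expected_estimate(strata):
    """Expectation of the estimate over the p.p.s. draw of one county per stratum."""
    total = 0.0
    for counties in strata:
        P_a = sum(P for P, _ in counties)
        # County i is drawn with probability P_ai / P_a and, when drawn,
        # contributes (P_a / P_ai) * x_ai to the estimate.
        total += sum((P / P_a) * (P_a / P) * x for P, x in counties)
    return total


# Two hypothetical strata; in the second, x_ai is exactly proportional to P_ai,
# so that stratum adds nothing to the between-county variance.
strata = [[(2000, 900), (3000, 1500), (5000, 2100)],
          [(4000, 1000), (6000, 1500)]]
true_total = sum(x for counties in strata for _, x in counties)
cv_squared = between_county_variance(strata) / true_total ** 2
```

The expectation equals the true total, which is the unbiasedness asserted in the text, and the second stratum shows why a measure of size closely proportional to the characteristic keeps the between-county term small.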


CALCULATIONS AND RESULTS SHOWING GAINS 
DUE TO STRATIFICATION 


For the purpose of evaluating the gains due to stratification, the
between-county contribution, the second term of $\sigma^2_{x'}$, is used.
Calculations are given for the square of the coefficient of variation for the
case of 10 counties from 10 strata and for 20 counties from 20 strata.
The standard of comparison is the (c.v.)² for the design in which the
same number of counties is selected with probability proportionate to
size (p.p.s.) from the state as a whole without stratification, but with
replacement. In this case, the variance for the standard is simply the
variance of the selection of one county with p.p.s., divided by the
number of counties selected.

Clearly, it would be desirable to have a better standard. A suitable 
design to use as a standard would be one that could be easily carried 
through in practice and yet not be too inefficient. We can expect that 
the design of sampling p.p.s. with replacement is far more inefficient 
than the standard should be. Systematic sampling p.p.s. would prob- 
ably be a good standard, if the order of the counties were assumed to be 
randomized before selecting, but further theoretical work needs to be 
done before this method could be applied in practice. 
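
For concreteness, systematic sampling p.p.s. from a randomized order might be sketched as below. This is one common form of the technique, offered only as an illustration; it is not a design used in the study, and the function name, county labels, and sizes are hypothetical.

```python
import random


def systematic_pps(units, n, rng):
    """Select n units systematically with probability proportionate to size.

    units: list of (name, size) pairs. The order is shuffled first, and a
    random start is taken within the sampling interval. A unit larger than
    the interval can be selected more than once.
    """
    order = units[:]
    rng.shuffle(order)
    total = sum(size for _, size in order)
    interval = total / n
    start = rng.random() * interval          # start lies in [0, interval)
    points = [start + k * interval for k in range(n)]
    selected, cumulative, j = [], 0, 0
    for name, size in order:
        cumulative += size
        # Take every selection point that falls inside this unit's range.
        while j < n and points[j] < cumulative:
            selected.append(name)
            j += 1
    return selected


# Twenty hypothetical counties with sizes 100, 110, ..., 290 farms.
counties = [("c%d" % i, 100 + 10 * i) for i in range(20)]
sample = systematic_pps(counties, n=5, rng=random.Random(1))
```

Each unit's chance of selection is its size divided by the interval, so the selection remains proportionate to size while no unit smaller than the interval can be drawn twice.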

The results showing the gains due to stratification are given in Table 
1. The squares of the coefficients of variation for the designs considered 
are given in columns 4 through 7, and the per cent gain due to stratifi- 
cation is given in columns 8 through 10. It is clear from columns 8 and 
9 that both the 10 strata and the 20 strata show large gains over select- 
ing an equal number of counties from the state as a whole without 
stratification. The gains are greatest for the highly correlated items of 
non-white operators, tenants, and croppers. The items showing least 
gain due to stratification are farms reporting telephone, number of 
males 14 years of age and over, and value of land and buildings. Most 
of the crop items are between the two extremes. In all cases, however, 
the gains are considerable, most items showing a gain of over 50 per 
cent in relative efficiency. This means that for most items, 20 counties 
selected from 20 strata would give results at least as good as those that 
would be obtained by selecting 40 counties from the state without 
stratification. 
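
The step from a 50 per cent gain to "20 counties as good as 40" can be made explicit. The sketch assumes that the per cent gain is measured as the reduction in (c.v.)² relative to the unstratified standard, which is the definition under which the statement above holds; the function names are ours.

```python
def percent_gain(cv2_unstratified, cv2_stratified):
    """Per cent reduction in (c.v.)^2 relative to the unstratified standard."""
    return 100.0 * (cv2_unstratified - cv2_stratified) / cv2_unstratified


def equivalent_unstratified_counties(n_counties, gain_percent):
    """Counties needed without stratification to match the stratified (c.v.)^2.

    Uses the fact that, for p.p.s. selection with replacement, the variance
    of the standard is inversely proportional to the number of counties.
    """
    return n_counties / (1.0 - gain_percent / 100.0)


# A 50 per cent gain with 20 stratified counties corresponds to 40 counties
# drawn from the state as a whole.
needed = equivalent_unstratified_counties(20, 50.0)
```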

The question of whether the 20 strata produce gains over the 10 
strata is answered in column 10. The two designs compared are that in
which 20 counties are selected, one from each of 20 strata, and that in 
which 20 counties are selected, two p.p.s. with replacement from each 
of 10 strata. The figures show that, in general, gains are greatest for 
those items showing the greatest gains in columns 8 and 9, and that 
although it pays to stratify the 10 into the 20 strata, most of the gains 
are under 50 per cent. Thus, it may be expected that if the 20 strata 
were further divided into, say, 40 strata, there might be some ad- 
ditional gains but again of a lesser degree. On the whole, the data give 
further evidence for the statistical folklore of ‘stratifying to the hilt.’ 

We see then that for preparing estimates for the State of North 
Carolina, if the county is the primary sampling unit, it definitely pays 
to stratify up to 20 strata, probably up to 40 strata, and quite pos- 
sibly, beyond. 

It is to be emphasized that the remarks made here apply to esti- 
mates in a general purpose survey for the state as a whole and are not 
valid for estimates of items appearing in concentrated areas. The 
stratification problem as well as the entire estimating problem is quite 
different in such cases. 


NUMBER OF COUNTIES NEEDED, USING SIMPLE UNBIASED ESTIMATE 


We have just seen that it pays to stratify the counties, and this 
result is not surprising. What is more surprising is that, in spite of 
these gains, 20 counties are insufficient for estimating most items with 
any reasonable degree of accuracy, and even 40 counties are insufficient 
for many items. This is true, if the simple unbiased estimate is used, 
even if the entire county is enumerated once it is selected for the 
sample. These results are readily seen by an examination of the coeffi- 
cients of variation which are presented in Table 2. The calculations for 
40 strata presented in column 6 were obtained as an approximation 
based on the results for 20 counties from 20 strata. The assumption 
made in the calculations is that the same per cent gain in efficiency 
would prevail in stratifying the 20 strata into 40 as was obtained by 
stratifying the 10 strata into 20. The additional gains due to stratifica- 
tion are thus over-estimated, but even in this favorable interpretation, 
a selection of 40 counties from 40 strata is not generally acceptable for 
most purposes. 
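
The approximation for 40 strata can be written out as follows. This is a sketch of our reading of the assumption just stated, with the gain again taken as a per cent reduction in (c.v.)²; the function name and the figures are hypothetical.

```python
import math


def extrapolated_cv2(cv2_20_strata, gain_10_to_20_percent):
    """Approximate (c.v.)^2 for 40 counties drawn one from each of 40 strata.

    Doubling the number of counties halves the (c.v.)^2; the further
    stratification is then assumed to reduce it by the same per cent gain
    that was observed in going from 10 strata to 20.
    """
    return cv2_20_strata / 2.0 * (1.0 - gain_10_to_20_percent / 100.0)


# Hypothetical item: c.v. of 8 per cent with 20 strata, and a 30 per cent
# gain from subdividing 10 strata into 20.
cv2_40 = extrapolated_cv2(0.0064, 30.0)
cv_40 = math.sqrt(cv2_40)
```

Since each further subdivision is expected to gain less than the one before, the figure so obtained errs on the favorable side.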

Specifically, the figures in Table 2 show that at the level of a 2 per 
cent c.v., i.e., the accuracy required to be within 4 per cent of the total
being estimated 95 per cent of the time, 10 counties would be insuffi- 
cient for estimating any one of the 20 items considered; 20 counties 
would be sufficient for only 3 of the items; and 40 counties would yield
reliable estimates for only 1 additional item, or a total of 4 items out
of the 20 considered. Even if we are satisfied with a c.v. of 5 per cent,
i.e., an accuracy of estimating the total of an item to within 10 per cent
of itself 95 per cent of the time, we would be able to estimate only 2
of the 20 items satisfactorily with 10 counties, 7 items with 20 counties,
and 13 items with 40 counties.
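
The correspondence between a c.v. and "within so many per cent, 95 per cent of the time" rests on taking two standard errors as the 95 per cent limit. A short sketch, with hypothetical item names and c.v. values that are not the figures of Table 2:

```python
def two_sigma_bound(cv):
    """Relative error not exceeded about 95 per cent of the time (2 x c.v.)."""
    return 2.0 * cv


def items_within(cv_by_item, target_cv):
    """Items whose coefficient of variation does not exceed the target."""
    return sorted(item for item, cv in cv_by_item.items() if cv <= target_cv)


# Hypothetical coefficients of variation for three items.
cvs = {"item A": 0.015, "item B": 0.04, "item C": 0.12}
ok_at_2_percent = items_within(cvs, 0.02)
ok_at_5_percent = items_within(cvs, 0.05)
```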


TABLE 2
DEGREE OF ACCURACY FOR DIFFERENT SIZES OF STRATA

                                              1945 Total        Coefficient of Variation
No.  Item                                       to be        10 strata  20 strata  40 strata
                                              Estimated                            (approx.)
(1)  (2)                                         (3)            (4)        (5)        (6)

 1   Number of farms                              287,412       .03
 2   Number of non-white operators                 74,273       .17
 3   Number of tenants                            122,577       .08
 4   Number of croppers                            62,687       .13
 5   No. males, 14 years old and over             395,785       .03
 6   Work off farm—F. R.                           55,212       .17
 7   Work off farm—days                         9,569,537       .24
 8   Farms reporting telephone                     14,539       .27
 9   Value of land and buildings            1,002,983,012       .08
10   Total value of products sold or used     592,631,568       .06
11   Total value of products sold—F. R.           262,084       .03
12   Total value of products sold—value       488,829,142       .07
13   Dairy products sold—F. R.                     46,958       .13
14   Dairy products sold—value                 20,005,528       .29
15   Tobacco—F. R.                                150,170       .11
16   Tobacco—acres                                648,196       .12
17   Cotton—F. R.                                 106,513       .15
18   Cotton—acres                                 714,177       .21
19   Peanuts—F. R.                                 30,350       .43
20   Peanuts—acres                                272,326       .54

[Columns 5 and 6 are illegible in this copy, apart from the isolated entries .01, .11, and .12, whose rows cannot be identified.]




















It is to be borne in mind here that no consideration has been given 
to the within-county contribution to the over-all variance. If that were 
added, the results would be even less satisfactory. In other words, our 
conclusions could be stated even more strongly. 

Since it is often considered, however, that 20 counties are an upper 
limit to the number that could be included in most surveys in North 
Carolina, because of cost and administrative limitations, it is necessary 
to examine better estimating techniques and modifications of design. 
We shall consider three additional estimation techniques. 


RATIO ESTIMATES—CALCULATIONS NECESSARY 
In studying the gains due to the use of the ratio estimate, the same 
basic sampling design is used as before, but instead of the simple
unbiased estimation equation, we use the ratio estimate. Two kinds of
ratio estimate are considered. 

(a). Ratio to estimated number of farms at time of survey. We still want
to estimate, as before, the total of a characteristic, say, number of
tenants, in 1945. Thus, to estimate $x$ as previously defined, we use

Estimation equation: $$x'_r = \frac{x'}{y'}\,y,$$

where the $x$'s are defined as before, the $y$'s are similarly defined but
with respect to the 1945 number of farms, and $y$ itself is the total
number of farms in 1945, obtained from the best available source, but
not based on the sample. Thus, we see that the numerator and
denominator are both random variables.

Bias of the ratio estimate: Since $E(x'/y') \neq Ex'/Ey'$, in general, the
ratio estimate is biased. The bias in estimating $x/y$ can be shown to be
approximately equal to

$$\frac{x}{y}\left[(\text{c.v.}_{y'})^2 - \text{c.v.}_{x'y'}\right].$$

If this expression is calculated, and the result multiplied by 100, the
bias in estimating $x$ is then expressed as a per cent of $y$.

Coefficient of variation: An approximation to the coefficient of
variation of the estimate is given by the well-known relation

$$(\text{c.v.}_{x'_r})^2 = (\text{c.v.}_{x'})^2 + (\text{c.v.}_{y'})^2 - 2(\text{c.v.}_{x'y'}).$$


Now we have already calculated the (c.v.)² of the numerator and
denominator, and the only additional computation that is necessary
is to calculate the covariance term. The three terms are then combined
to give the over-all (c.v.)². The results of these calculations are given
in Table 3, but we shall consider the calculations for the second type
of ratio estimate, and then compare the results of both of them.
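
The combination of the three terms, and the approximate bias, can be sketched as follows. Here the covariance term is taken to be the relative covariance, i.e., the covariance of $x'$ and $y'$ divided by $xy$, which is the standard reading and is our assumption; the function names and all figures are hypothetical.

```python
import math


def ratio_cv2(cv2_x, cv2_y, relcov_xy):
    """(c.v.)^2 of the ratio estimate: cv2_x + cv2_y - 2 * relcov_xy."""
    return cv2_x + cv2_y - 2.0 * relcov_xy


def ratio_bias(x, y, cv2_y, relcov_xy):
    """Approximate bias of x'/y' as an estimate of x/y."""
    return (x / y) * (cv2_y - relcov_xy)


# Hypothetical item: c.v. of 8 per cent for the numerator, 7 per cent for
# the denominator, and a correlation of 0.9 between the two.
cv2_x, cv2_y = 0.0064, 0.0049
relcov = 0.9 * math.sqrt(cv2_x) * math.sqrt(cv2_y)

cv2_ratio = ratio_cv2(cv2_x, cv2_y, relcov)
bias_per_cent_of_y = 100.0 * ratio_bias(120000.0, 280000.0, cv2_y, relcov)
```

With these figures the (c.v.)² falls from .0064 to about .0012. The closer the correlation between numerator and denominator, the larger the covariance term and the greater the reduction.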

(b). Ratio to total of same characteristic at last census. Again, we are
still interested in estimating $x$, as in the first ratio estimate considered
above. The estimation equation here, however, is a little different,
although its formal representation is the same as that given for the
previous case. Here, however, the $y$'s refer to the 1940 totals of the
same characteristic used in the $x$'s for 1945. Thus, the effect of this
estimation equation is to estimate the change in a given characteristic
over a period of time and to apply this estimate of change to the over-all 
base figure. 

Again, the (c.v.)² has the same formal representation as in the
previous case. Here, however, the (c.v.)² for the denominator has not
been calculated before, so that additional computation is necessary for
both the (c.v.)² of the denominator and the covariance term. Again, the
three terms are combined to give the over-all (c.v.)². The results
of these calculations are given in Table 3. It will be seen that
although this ratio estimate is more laborious to calculate, the gains
achieved through its use are so substantial that it would seem advisable
to use it wherever possible.
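
As a small worked case of ratio (b), the estimated change is applied to the census base figure. The function name and the numbers below are hypothetical and serve only to show the arithmetic.

```python
def ratio_to_census_estimate(x_unbiased, y_unbiased, y_census_total):
    """Apply the estimated change x'/y' to the known base-period total y."""
    return x_unbiased / y_unbiased * y_census_total


# Hypothetical figures: the sample counties give unbiased estimates of
# 130,000 (1945) and 125,000 (1940) for an item whose 1940 census total
# was 120,000.
estimate_1945 = ratio_to_census_estimate(130000.0, 125000.0, 120000.0)
```

The sample here overstates the 1940 total by about 4 per cent; the ratio estimate corrects for this by carrying only the estimated change, 1.04, over to the census figure, giving 124,800.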


RATIO ESTIMATES—INTERPRETATION OF RESULTS 


The work has been completed for only half of the items considered 
in the study. Furthermore, calculations were made only for the case of 
selecting 20 counties from 20 strata. On the basis of the figures in 
Table 3, it is readily seen that the first ratio estimate gives only 
moderate improvements in most cases and can give even a slight loss 
in a few cases, while the second ratio estimate gives considerable gains 
in all cases considered, enough so that 20 counties are sufficient to give 
estimates within a 5 per cent c.v. for almost all items. 

In column 5, we have the (c.v.)² for the unbiased estimate, while in
column 7, we have the (c.v.)² for the estimate based on the ratio to the
1945 number of farms. Comparison of the figures in these two columns
shows that for most items, the ratio estimate shows a gain of about 10
to 20 per cent over the unbiased estimate, but that for the crop items
of tobacco and cotton, the ratio estimate shows either only a slight
gain or even a loss with respect to the unbiased estimate. Thus, except
for estimating socio-economic characteristics like non-white operators
or tenants, the ratio estimate to number of farms at the time of the
survey is not to be recommended.

The (c.v.)² for the second ratio estimate, ratio to the total of the
same characteristic at an earlier census date, is given in column 8.
Comparison of these (c.v.)² with those for the unbiased estimate and
ratio to farms at survey time shows tremendous gains over these two
estimates. In fact, examination of the c.v. for the ratio to 1940 estimate,
given in column 10, shows that almost all items are estimated well
within the 5 per cent level. The only item which remains poorly
estimated, in spite of tremendous gain of the ratio estimate over the
unbiased estimate, is work off farm, both for farms reporting and days
worked. Examination of the individual county contributions to the
over-all variance showed that a few counties like New Hanover, where
work connected with the war effort was available, contributed an
unusual amount to the between-county variance. Thus, the ratio (b)
estimate is very good, except on items that are greatly affected by
disturbances over a period of time.

Calculations for the ratio to 1945 number of farms estimate, similar 
to those for the ratio to 1940 total of same characteristic, are given in 
column 9. The figures show that hardly any items can be estimated 
within the 5 per cent level of accuracy. Thus, the ratio (a) estimate is,
in general, not enough of an improvement over the unbiased estimate 
to make 20 counties yield satisfactory estimates. 

The per cent bias of the ratio estimate was calculated only for the 
ratio (b) estimate, i.e., ratio to 1940 total of the same characteristic. 
The figures are given in column 11 of Table 3. The highest per cent 
biases shown are about one-tenth of one per cent and most of them are 
less than five-hundredths of one per cent. Thus, the bias in the ratio
estimate considered here is so negligible as not to be of any concern
in using the ratio (b) estimate. 


COMPARISON OF REGRESSION ESTIMATE WITH RATIO ESTIMATE (b) 


Suppose that in order to estimate the quantity, $x$, we use the
regression equation

$$x'_R = x' - \frac{\sigma_{x'y'}}{\sigma^2_{y'}}\,(y' - y),$$

where $x'$ is the unbiased estimate of $x$ and $y'$ is the unbiased estimate
of $y$, as before, where $y$ refers to the total of the $x$ characteristic at the
earlier census date; and it is assumed that the regression coefficient,
$\sigma_{x'y'}/\sigma^2_{y'}$, is accurately known.
It is shown in the appendix that the relation between the (c.v.)² for
the ratio estimate and that of the above regression estimate is given by

$$(\text{c.v.}_{x'_r})^2 = (\text{c.v.}_{x'_R})^2 + \frac{1}{(\text{c.v.}_{y'})^2}\left[\text{c.v.}_{x'y'} - (\text{c.v.}_{y'})^2\right]^2,$$

where $x'_r$ refers to the ratio estimate and $x'_R$ refers to the regression
estimate. Thus, if we calculate the second term on the right, we have a
measure of the gain of the regression estimate over the ratio estimate.
These calculations were performed, and the results are shown in Table
4. In column 4, we have A, or the second term of the expression above,
which gives the exact difference between the (c.v.)² for the ratio and
regression estimates; column 5 gives the (c.v.)² for the ratio estimate;
and column 6 gives the ratio of these two columns, expressed in per
cent. The figures in column 6 then show that, for the items considered,
the greatest gains that can be expected from the regression estimate
over the ratio estimate are just about 5 per cent, while most of the
gains shown are considerably less. Thus, the decision to use one or the
other of these estimates depends mostly on convenience.
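
The relation between the two (c.v.)² can be checked numerically. The sketch assumes a known regression coefficient, so that the (c.v.)² of the regression estimate is that of $x'$ less the squared relative covariance divided by the (c.v.)² of $y'$; the function names and figures are hypothetical and are not those of Table 4.

```python
def ratio_cv2(cv2_x, cv2_y, relcov_xy):
    """(c.v.)^2 of the ratio estimate."""
    return cv2_x + cv2_y - 2.0 * relcov_xy


def regression_cv2(cv2_x, cv2_y, relcov_xy):
    """(c.v.)^2 of the regression estimate with a known coefficient."""
    return cv2_x - relcov_xy ** 2 / cv2_y


def difference_term(cv2_y, relcov_xy):
    """Second term of the relation: (relcov_xy - cv2_y)**2 / cv2_y."""
    return (relcov_xy - cv2_y) ** 2 / cv2_y


# Hypothetical item: (c.v.)^2 of .0064 and .0049, relative covariance .00504.
cv2_x, cv2_y, relcov = 0.0064, 0.0049, 0.00504
A = difference_term(cv2_y, relcov)
per_cent_gain = 100.0 * A / ratio_cv2(cv2_x, cv2_y, relcov)
```

The difference term vanishes when the relative covariance equals the (c.v.)² of the denominator, that is, when the regression line passes through the origin; in that case the ratio and regression estimates are equally precise.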


TABLE 4
RELATIVE EFFICIENCY OF REGRESSION ESTIMATE OVER RATIO ESTIMATE

                                   1945 Total to                                       Per Cent‡
Item                               be Estimated       A*     $(\text{c.v.}_{x'_r})^2$†       Gain
(2)                                     (3)           (4)            (5)                 (6)

Number of non-white operators             74,273
Number of tenants                        122,577
Number of croppers                        62,687
Work off farm—F. R.                       55,212
Work off farm—days                     9,569,537
Value of land and buildings        1,002,983,012
Tobacco—F. R.                            150,170
Tobacco—acres                            648,196
Cotton—F. R.                             106,513
Cotton—acres                             714,177

[Entries for columns 4 to 6 are illegible in this copy.]

* $A = \frac{1}{(\text{c.v.}_{y'})^2}\left[\text{c.v.}_{x'y'} - (\text{c.v.}_{y'})^2\right]^2$.
† $(\text{c.v.}_{x'_r})^2$ = (c.v.)² for ratio estimate to total of characteristic at previous census.
‡ Per cent gain of regression estimate over ratio estimate = $A/(\text{c.v.}_{x'_r})^2$.
§ Less than .0000005.
‖ Less than 0.05.

STABILITY OF (C.V.)² OVER 5-YEAR PERIOD


As part of the calculations for the (c.v.)² for the ratio estimate based
on the 1940 total of the same characteristic, we have the (c.v.)² of the
denominator of the ratio, which is the same as the (c.v.)² for the
unbiased case for 1940. These figures are given in Table 3, column 6.
Comparing these figures with those in column 5, which give the (c.v.)²
for the unbiased estimate for 1945, we see that the two sets of figures
are, in general, within 10 or 20 per cent of each other, so that it would
be possible to make a fair estimate of the c.v. of the numerator of the
ratio estimate from a knowledge of the c.v. for the denominator. The
relative stability of this c.v. over a five-year period is important, since
at the time of a survey, data would, in general, be available only for
the denominator, and not for the numerator. The figures show that
only for special items, unusually affected by critical changes during 
the interim period, are the c.v.’s for the base period poor estimates of 
the ‘true’ c.v.’s needed to plan the survey. In fact, the only items that 
are not estimated well are again the work off farm items, which 
showed considerable changes in work habits in a few counties but not 
in others. Thus, the calculations show that the c.v. is relatively stable 
over a five-year period, except in a few special cases. 


POSSIBLE USE OF NUMBER OF FARMS EXCLUDING CROPPERS 
AS A MEASURE OF SIZE 


In this study, the 1940 number of farms is used as the measure of 
size of the primary sampling unit. Another suggested measure of size* 
is farms minus croppers. Calculations are presented below using 1940 
number of farms minus croppers as the measure of size for some of the 
items considered in this study. The calculations are carried out for the 
design in which 20 counties are selected p.p.s. from 20 strata. The 
coefficients of variation are presented in Table 5. Comparative figures 
are given for the use of total number of farms as a measure of size. 


TABLE 5
COEFFICIENTS OF VARIATION, COMPARING TWO MEASURES OF SIZE

                                                      (c.v.)² for 1940 Meas. of Size
                                   1945 Total to be   No. of Farms     Total No. of
Item                               Estimated          − Croppers       Farms
(2)                                     (3)               (4)              (5)

Number of non-white operators             74,273                          .0063
Number of tenants                        122,577                          .0017
Number of croppers                        62,687                          .0048
Work off farm—F. R.                       55,212                          .0119
Work off farm—days                     9,569,537                          .0231
Value of land and buildings        1,002,983,012                          .0027
Tobacco—F. R.                            150,170                          .0048
Tobacco—acres                            648,196                          .0058
Cotton—F. R.                             106,513                          .0075
Cotton—acres                             714,177                          .0182

[Entries for column 4 are illegible in this copy.]




The figures in Table 5 show that, for every item considered, except 
the two work off farm items, the (c.v.)² is higher when the measure of
size is taken so that the croppers are subtracted from the total number 
of farms. For estimating such socio-economic characteristics as the 





* See, for example, thesis by Margaret Stone, “Efficiency of National Samples Having the County 
as Sampling Unit,” Iowa State College, 1946. 
number of non-white operators, tenants, and croppers, the (c.v.)² is
just about double what it is when the total number of farms is taken
as the measure of size. Even for estimating acreage of cotton and
tobacco, the (c.v.)² is considerably increased by leaving out the
croppers from the total number of farms. For the work off farm items,
however, the (c.v.)² is less when the croppers are omitted from the
measure of size.


CONCLUSIONS 


Calculations in this study show that the between-county contribu- 
tion to the variance, in the case of the simple unbiased estimate, is so 
large for most socio-economic and agricultural items, that not even 
40 counties completely enumerated out of the 100 counties in North 
Carolina would give estimates to within 10 per cent of the item 95 
per cent of the time for almost half the items considered. Since 20 
counties are considered a large number to include in most surveys of 
the State, three methods for improving the estimation equations are 
considered. 

The ratio estimate to the number of farms at the time of the survey 
gives some improvements in socio-economic items but not in agricul- 
tural items. The ratio estimate to the total of the same characteristic 
at an earlier census date gives such tremendous gains that 20 counties 
are sufficient to give satisfactory results for almost all items. The 
regression estimate shows only slight gains over the second ratio 
estimate considered. 
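
The contrast between the simple unbiased estimate and the ratio estimate to an earlier census total can be illustrated with a minimal Python sketch. It assumes an equal-probability sample of counties rather than the stratified p.p.s. design used in the study, and the county figures are hypothetical, chosen so that the current characteristic is exactly proportional to its earlier census value.

```python
from typing import Sequence


def simple_unbiased_total(y_sample: Sequence[float], n_counties: int) -> float:
    """Expand the sample mean per county to all counties in the State."""
    return n_counties * sum(y_sample) / len(y_sample)


def ratio_total(y_sample: Sequence[float],
                x_sample: Sequence[float],
                x_census_total: float) -> float:
    """Scale the known census total of x by the sample ratio of y to x."""
    return x_census_total * sum(y_sample) / sum(x_sample)


# Hypothetical counties (not data from the study): x is the earlier census
# value of the characteristic, y is its value at the time of the survey.
x_census = [100, 200, 400, 800, 1600, 3200]
y_current = [110, 220, 440, 880, 1760, 3520]
sampled = [0, 3]  # indices of the two sampled counties

y_s = [y_current[i] for i in sampled]
x_s = [x_census[i] for i in sampled]

print(sum(y_current))                               # true total: 6930
print(simple_unbiased_total(y_s, len(x_census)))    # 2970.0
print(ratio_total(y_s, x_s, sum(x_census)))         # 6930.0
```

With county sizes this unequal, the simple expansion depends heavily on which counties happen to be drawn, while the ratio estimate recovers the total whenever the characteristic moves in proportion to its census value. Real data will only approximate that proportionality.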

On the whole, the between-county variances obtained in this study 
are so large as to indicate that the county is not a very efficient sam- 
pling unit for the purpose of making state estimates. The minor civil 
division should be a more efficient primary sampling unit for state 
estimates, but as yet no results are available for evaluating such gains. 

The gains from the use of the minor civil division together with the 
gains from such techniques as area substratification and the complete 
enumeration of large farms may well be expected to yield satisfactory 
estimates. 


APPENDIX 


RELATION BETWEEN REGRESSION AND RATIO ESTIMATES 


To derive the expression for A which measures the relation between 
the regression and ratio estimates, we first give an expression for the 
variance of the regression estimate, assuming the true regression co- 
efficient is known. 
To estimate a total, x, we use 

Estimation equation: 

$$x'_R = x' + \frac{\sigma_{x'y'}}{\sigma_{y'}^{2}}\,(y - y'),$$

where x' and y' are unbiased estimates of the totals, x and y. Since 
we are using the true regression coefficient rather than that of the 
sample, the variance obtained by this formula will be somewhat less 
than if the sample regression coefficient were used. In cases where the 
ratio and regression estimates are about equally efficient, however, 
there seems to be no need for greater exactness in the variance of the 
regression estimate. 

Variance of estimate: Since Ex' = x and Ey' = y, we have Ex'_R = x, 
and it is easy to see that 

$$\sigma_{x'_R}^{2} = \sigma_{x'}^{2} - \frac{(\sigma_{x'y'})^{2}}{\sigma_{y'}^{2}}.$$


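RELATION BETWEEN THE (C.V.)² OF THE REGRESSION 
AND RATIO ESTIMATES 


To obtain the (c.v.)² of the regression estimate, we divide $\sigma_{x'_R}^{2}$ by 
$x^{2}$, obtaining 

$$(\mathrm{c.v.}_{x'_R})^{2} = \frac{\sigma_{x'}^{2}}{x^{2}} - \frac{(\sigma_{x'y'})^{2}}{x^{2}\,\sigma_{y'}^{2}}.$$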


For the ratio estimate, $x'_r$, it is known that, approximately, 

$$(\mathrm{c.v.}_{x'_r})^{2} = \frac{\sigma_{x'}^{2}}{x^{2}} + \frac{\sigma_{y'}^{2}}{y^{2}} - \frac{2\,\sigma_{x'y'}}{xy}.$$


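Therefore, 

$$(\mathrm{c.v.}_{x'_r})^{2} = (\mathrm{c.v.}_{x'_R})^{2} + \frac{1}{(\mathrm{c.v.}_{y'})^{2}}\left[\mathrm{c.v.}_{x'y'} - (\mathrm{c.v.}_{y'})^{2}\right]^{2},$$

where $\mathrm{c.v.}_{x'y'} = \sigma_{x'y'}/(xy)$ and $(\mathrm{c.v.}_{y'})^{2} = \sigma_{y'}^{2}/y^{2}$, and the second term on the right, 

$$\frac{1}{(\mathrm{c.v.}_{y'})^{2}}\left[\mathrm{c.v.}_{x'y'} - (\mathrm{c.v.}_{y'})^{2}\right]^{2},$$

gives the expression for the difference between the squares of the 
coefficients of variation of the ratio and regression estimates. 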
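
The relation between the two (c.v.)² expressions can be checked numerically. The following Python sketch is only an illustration: the variances, covariance, and totals are hypothetical values, and the function names are introduced here for convenience. It computes the (c.v.)² of the regression estimate (true regression coefficient assumed known), the approximate (c.v.)² of the ratio estimate, and the squared-bracket term, and confirms that the term equals the gap between the first two.

```python
import math


def cv2_regression(var_x: float, var_y: float, cov_xy: float, x_total: float) -> float:
    """(c.v.)^2 of the regression estimate with known regression coefficient."""
    return var_x / x_total**2 - cov_xy**2 / (x_total**2 * var_y)


def cv2_ratio(var_x: float, var_y: float, cov_xy: float,
              x_total: float, y_total: float) -> float:
    """Approximate (c.v.)^2 of the ratio estimate."""
    return (var_x / x_total**2
            + var_y / y_total**2
            - 2 * cov_xy / (x_total * y_total))


def difference_term(var_y: float, cov_xy: float,
                    x_total: float, y_total: float) -> float:
    """[rel. covariance - (c.v. of y')^2]^2 divided by (c.v. of y')^2."""
    cv2_y = var_y / y_total**2
    rel_cov = cov_xy / (x_total * y_total)
    return (rel_cov - cv2_y) ** 2 / cv2_y


# Hypothetical inputs, not taken from the study.
var_x, var_y, cov_xy = 400.0, 900.0, 480.0
x_total, y_total = 1000.0, 2000.0

gap = (cv2_ratio(var_x, var_y, cov_xy, x_total, y_total)
       - cv2_regression(var_x, var_y, cov_xy, x_total))
term = difference_term(var_y, cov_xy, x_total, y_total)
print(math.isclose(gap, term, rel_tol=1e-9))  # True
```

Because the term is a square divided by a positive quantity, the ratio estimate can never have a smaller approximate (c.v.)² than the regression estimate, and the two coincide when the relative covariance equals the (c.v.)² of y'.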





INTERNATIONAL ORGANIZATIONS AND 
SOVIET STATISTICS 


Naum Jasny 


The compilation and analysis of economic statistics for the 
countries of the world by various international organizations 
is of importance to research in international economics. How- 
ever, the present situation is that statistics from certain coun- 
tries, particularly the Soviet Union, when available at all, are 
not only not comparable with those of the rest of the world, 
but are issued for propaganda purposes and cannot be con- 
sidered as statements of existing conditions. The problem 
which must be faced is whether to discontinue the publication 
of such statistics entirely in world reports, or whether it is 
possible to publish them in such a form that their weaknesses 
are clearly evident. 


It is no exaggeration to say that empirical research in international 
economics was very greatly facilitated when the international or- 
ganizations began first to compile, and then to analyze, statistics for all 
or most countries of the world. But the compilation and interpretation 
of statistics of the many countries by the international organizations 
met with various difficulties. A major difficulty was and is that the 
statistics of various countries, and the various statistics of the same 
country, vary greatly in reliability. Yet for some reason or other, prob- 
ably a good one, the international organizations decided to reproduce 
all statistics without comment, strictly as reported to them by the 
statistical organizations of the countries involved—indeed, even to use 
all of them indiscriminately in analysis as if they were equally reliable. 

For example, Argentina—the deciding factor, for decades, in the 
world corn market—for years used to put out greatly underestimated 
figures for her corn output. In the second half of her shipping season, 
she frequently continued to export heavily, although according to her 
crop statistics no corn should have been available for exportation. 
While everybody interested was well aware of the true situation from 
the excellent reports in Times of Argentina and other sources, the Inter- 
national Institute of Agriculture not only carried the incorrect esti- 
mates, but based its appraisals of the world corn market upon them. 
Several other such phenomena could be cited, although none possibly 
was as significant as the situation with the corn market. 

Such deficiencies in the published statistics of the international or- 
ganizations, however, seem almost negligible as compared with the con- 
fusion created at the end of the ’twenties, when the Soviet Union declared 
“neutral” statistics of simple fact-finding type to be anathema, 
and ordered its statistics henceforth to become socialist, Marxian, 
proletarian, class, and what not. They were to help in the socialist re- 
construction of Soviet life, in the class struggle. 

Suppressing unfavorable statistics, picking unrepresentative years, 
comparing data pertaining to different territories, all these have been 
among the minor means.¹ The all-important indexes of industrial pro- 
duction were prepared in terms of so-called “unchangeable 1926-27 
prices,” which implied an upward bias of at least 33⅓ per cent in 1928- 
37, of much more than this in 1937-40 and especially in 1940-45, and 
of a great deal in 1946-48. Productivity of labor computed from the 
industrial output, in terms of those prices, and from the stated labor 
force, repeats fully the exaggeration of the output figures; and there 
may also be present an additional upward bias in that industrial output 
includes that of the numerous inmates of concentration camps, while 
the stated labor force includes only part of these or does not include 
them at all. Since the bias of the “unchangeable 1926-27 prices” affects 
finished industrial products much more than raw materials, the national 
income from industry is more exaggerated than the industrial output 
itself. 

In 1933, the Soviet Union turned to estimating her grain yields and 
crops in the fields prior to harvest. The yields and crops thus estab- 
lished are called biological, on the root, and, in the law itself, factual. 
In 1939 the system was extended to cover other crops. The law empha- 
sized that even beets and potatoes and their parts, remaining under 
ground, were to be included in the official estimate. Analysis by the 
present writer showed that in order to arrive at the harvested yields 
and crops, the standing official estimates of grain yields and crops of 
1933-39 would have to be cut down by about 20 per cent.² Since 1947 
the crop estimating has been fully concentrated in the hands of the 
central government, and the needed discount of all official crop esti- 
mates made a substantial jump, for grain probably to almost 30 per 
cent of the official estimate. Those never-harvested portions of the crops 
are irrationally included in the estimates of gross agricultural produc- 
tion, and are transferred from there to national-income computations. 

1 The real nature of Soviet statistics has been recognized for a long time, but only by a few. Most 
of the credit goes to Colin Clark (A Critique of Russian Statistics, London, 1939), whose standing estimate 
of the increase in Soviet national income in 1928-38 is 56 per cent, as against the official figure of 320 
per cent. By his Bulletin and the book, Russlands Volkswirtschaft unter den Sowjets, Zurich and New 
York, 1944, S. N. Prokopovicz contributed much to the knowledge on the subject. Julius Wyler (“Na- 
tional Income of Soviet Russia,” Social Research, XIII, Dec. 1946, pp. 501-18) unmistakably said what 
he thinks of Soviet statistics both by his findings and by the title of the first section: “A Statistical 
Puzzle.” See also writer’s “Intricacies of Russian National-Income Indexes,” The Journal of Political 
Economy, LV, No. 4, August 1947, pp. 299-322. The February issue of The Review of Economic Statistics 
contains the writer’s “Soviet Statistics,” a study originally written early in 1948. It discusses the 
shortcomings of Soviet statistics at greater length than was needed for the specific purpose pursued 
in the present study. 

2 See the writer's The Socialized Agriculture of the USSR. Plans and Performance (Stanford, 1949), 
Chapter XXII and others. 

Space prevents consideration of numerous other statistical devices 
similarly helpful to socialist reconstruction. It must be mentioned, how- 
ever, that since the war the Soviet Union has not released anything 
which, at least formally, answers the requirements of statistics, i.e., 
expresses the amounts stated in some such units of measure as tons, 
bushels, or acres with the data given by administrative regions and 
districts and released by organizations and in publications organized 
for this purpose. Rather, the data are published in the form of per- 
centages of the previous year or quarter-year, or of prewar, without 
statement of the standing estimate for the prewar output in the present 
territory or indeed in any territory. 

Thus far, the international organizations have continued to accept 
and publish those “class” statistics without contradiction. Thus they 
have been instruments for spreading such numbers all over the world. 
Economic literature on both sides of the Atlantic is flooded with incor- 
rect official Soviet statistics. The responsibility of the international 
organizations for this is the greater, the larger is the use of their publi- 
cations and the wider the respect for them. 

The Soviets were fully aware of the propaganda value of these or- 
ganizations. Indeed, realizing that the Soviet people are familiar with 
the “class” nature of their country’s statistics, the Soviets were eager 
to re-import their own statistics under the independent brand of the 
international organizations. 


THE INTERNATIONAL INSTITUTE OF AGRICULTURE IN ROME 


For some years after 1933, the International Institute of Agriculture 
continued to carry the Soviet “factual,” biological, or on-the-root 
yields and crops as if they were perfectly comparable with the crops of 
other countries. Obviously, not only the Soviet data, but also those of 
total world production, the share in it of the Soviet Union, and the rela- 
tions among the various commodities, had become wrong—the more so 
the larger was the share of the USSR in world output of each com- 
modity. Of rye, which the USSR produces heavily, the world output 
and the share of the USSR in it were exaggerated by more than 10 per 
cent through the exaggeration of Soviet output. The 1937 world out- 
put of barley was larger than that of rye, but appeared smaller in the com- 
pilation of the International Institute in Rome for the same reason. 
Not before its 1938-39 yearbook did the Institute include a warning 
with reference to Soviet yields, in a carefully worded footnote which by 
no means disclosed the real situation. The footnote was as follows: 

The method of collecting statistics having been modified in 1934 in 
Belgium, in 1936 in Bulgaria, and in 1933 in the Soviet Union, the figures 
from those years onward are not strictly comparable with those of the pre- 
ceding years. 


Not an inkling was given that the Soviet estimates had become esti- 
mates of unharvested crops and were therefore not comparable with the 
estimates of harvested crops of all other countries. The qualification “not 
strictly” before comparable in the citation is also worthy of note. 

Fortunately, the Soviet Union did not provide data for every year 
covered in tabulations in the International Yearbooks of Agricultural 
Statistics, and the Institute began to publish a double set of world 
totals, with and without the USSR. At least the latter were reasonably 
dependable. 

Experience shows that footnotes are an inadequate protection. The 
fact is that even the statisticians of the League of Nations did not 
notice, or else disregarded, the footnote in the International Yearbooks 
of Agricultural Statistics (see below, pp. 55-56). 


LEAGUE OF NATIONS 


It was difficult to pursue research in world industrial output without 
the indexes of the League of Nations, and it certainly was regrettable 
that in the last years of their publication the indexes were in such shape 
that their absence would have served as well as their presence. Accord- 
ing to the League of Nations, the share of the Soviet Union in the world 
industrial output was equivalent to 18.5 per cent in 1936-38; the 
Union’s output is supposed to have amounted to 57.5 per cent of the 
industrial output of the United States.³ 

The volume U.S.S.R. and the Capitalist Countries (Moscow 1939) 
may not have been taken as an official Soviet publication by the League 
of Nations, but it is so used by all who are familiar with Soviet prac- 
tices. Although that volume is not marked as a second edition, it is in 
fact the second edition of U.S.S.R. and the Capitalist World (Moscow 
1934), which carried at its head “Institute of Economic Investigations 
of the Gosplan” (the 1939 edition gives the Gosplan merely as pub- 
lisher). The preface to the 1939 edition (p. xv) contains a reference to 
“preceding compilations,” one of which certainly was the 1934 volume. 





3 Industrialization and Foreign Trade, League of Nations, 1945, p. 13. 
CHART 1 

MOVEMENT OF MANUFACTURING PRODUCTION OF THE WORLD AND ELEVEN 
PRINCIPAL INDUSTRIAL COUNTRIES.* 
(WORLD IN 1913 = 100) 

* From Industrialization and Foreign Trade, League of Nations, 1945. 

CHART 2 

SOVIET INDUSTRIAL OUTPUT 1928-1940.* (1928 = 100 FOR WRITER’S CURVE; 
1929 = 100 FOR OTHERS.) 

* Soviet official data and those of the German Institute of Economic Research from Statistical 
Yearbook of the League of Nations, 1941-1942, p. 164. The League of Nations' indexes from Industrializa- 
tion and Foreign Trade, op. cit., p. 134. The writer's indexes, prepared more than a year ago, later 
were changed slightly with the progress of the analysis. 

Y. Joffe, a prominent Soviet analyst, apparently a staff member either 
of the above institute or of the Institute of Economics of the Academy 
of Sciences of the USSR, was responsible for both volumes. The ma- 
terial compiled in U.S.S.R. and the Capitalist Countries, as well as other 
data obviously prepared simultaneously with it, indeed based upon it, 
has been used without stating source by such writers as M. Kalganov, 
the official Soviet analyst of national income,⁴ and A. Notkin, another 
prominent official analyst.⁵ 

According to U.S.S.R. and the Capitalist Countries (p. 8), the share of 
the USSR in world industrial output in 1937 was equivalent to 13.7 per 
cent as against 18.5 per cent on the average of 1936-38 according to the 
League of Nations. The relation of the industrial output of the USSR 
to that of the United States in the same years was estimated at 32.7 
per cent by the Soviet source and at 57.5 per cent by the League of 
Nations. These certainly are great differences, likely to affect funda- 
mentally the appraisal of the Soviet industrialization drive. 

It seems rather unlikely that the League of Nations failed to see 
U.S.S.R. and the Capitalist Countries, or the article by Notkin, who on 
p. 48 repeated the figure of 32.7 as the percentage relation of USSR 
industrial output to that of the United States in 1937. The League of 
Nations might not have realized that if, according to Kalganov,⁶ the 
national income of the USSR in 1937 was 46.6 per cent of that of the 
United States in 1929, the industry of the much less industrialized 
USSR would not have been equivalent in 1936-38 to 57.5 per cent of 
that of the United States. However, the League of Nations knew that 
the Soviet indexes of industrial output had a strong upward bias. 
Otherwise it would not have included in its 1941-42 yearbook, side by 
side with the official Soviet indexes, the considerably lower ones of the 
private German Institute of Economics.⁷ 

Chart 1 is reproduced from Industrialization and Foreign Trade by the 
League of Nations (p. 12). Chart 2 shows the indexes of industrial out- 
put in the Soviet Union from 1928 or 1929 to 1938-40 as computed 
officially, by the League of Nations, by the German Institute, and by 
the present writer. The relatively small differences between the writer’s 
estimates of the increase in Soviet industrial output during the ’thirties 





4 “National Income of the U.S.S.R. and the Basic Economic Task,” Problems of Economics, monthly 
of the Academy of Science of the USSR, Institute of Economics, April 1940, p. 111. 

5 “Relations between Means of Production and Consumption,” Problems of Economics, November 
1940, pp. 48-49, and others. 

6 Loc. cit. 

7 Statistical Yearbook of the League of Nations, 1941-42, p. 164. 
and those of the German Institute are due to the fact that the most 
important part of the writer’s estimates were obtained directly from 
Soviet sources (interpretation of these), while the German Institute 
apparently made its own computations. Moreover, at least part of its 
estimates were completed by the German Institute before publication 
of the Soviet data utilized by the present writer. The indexes of the 
League of Nations, on the other hand, point to an increase in Soviet 
industrial output even greater than that implied in the official data. 
The reason is that the League’s data are based not on the value of the 
total industrial output but on value added by the industry. The 
League of Nations certainly had good grounds for accepting this pro- 
cedure as the general rule, but it should have realized that in the spe- 
cific case of the USSR the procedure inflated further the considerably 
inflated trend of the official indexes of industrial output. 

In an important monograph on the recovery of European agriculture 
from the effects of World War I, the League of Nations gave grain pro- 
duction in various parts of Europe as follows (million metric tons):⁸ 

Area                           1909-13    1934-38    Increase in Per Cent 
Western Continental Europe      64.85      65.64             1.2 
Eastern Continental Europe      48.13      53.64            11.0 
Russia-USSR                     63.32      87.89            38.8 

Thus the USSR was indicated to be not merely the only country with 
an unusually large increase in industrial output in postwar years, but 
also to have been far ahead in the output of cereals, the heavily 
dominant farm product of that country. This stands even with the fol- 
lowing reservation on p. 20: “This figure [the increase of USSR output 
by almost 40 per cent] would be reduced to slightly below 30 per cent if 
the pre-war estimates were, which is not impossible, 10 per cent too low 
compared with postwar crop estimates.” This cut, it should be noted, 
was made only on the consideration (p. 11) that “the 1909-13 data 
would have been higher, had the same methods of estimating crops 
been employed then as have been employed since the middle of the 
twenties.” No allowance whatever was made for the turn to “biological” 
estimates after 1933, in spite of the footnotes in the Yearbooks of Agri- 
cultural Statistics. The League of Nations (ibid., p. 20) even brings forth 
explanations for the large increase in Soviet grain output after World 
War I in “the application on a large scale of modern mechanized pro- 
duction methods” and “very significant advance in plant physiology.” 








8 Agricultural Production in Continental Europe during the 1914-18 War and the Reconstruction 
Period, League of Nations, 1943, p. 9. 
The latter was probably Lysenko’s yarovisatsiya, a procedure of speed- 
ing up maturity, which failed of acceptance outside of the USSR, and 
within that country has probably to thank the political influence of the 
discoverer for its relatively wide acceptance. 

Before the Soviet statistics were given the assignment of helping in 
the drives as true class, Marxian, etc., statistics, the level of grain out- 
put in 1913 was estimated by the Gosplan (State Planning Commis- 
sion) at 81.6 million tons.⁹ An analysis by the writer shows that from 
the average official 1934-38 crop of 95.5 million tons a discount of 
roughly 20 per cent must be made to bring it down to the basis “har- 
vested.”¹⁰ Thus the average harvested crop in 1934-38 was 76.4 million 
tons, i.e., it was smaller than the 1913 output (metric tons throughout). 
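
The arithmetic of this discount can be written out in a few lines of Python. The sketch below uses only the figures quoted in the paragraph; the variable and function names are introduced here for illustration, and the 20 per cent discount is the writer's rough estimate, not an exact constant.

```python
# Figures quoted in the text, in million metric tons.
OFFICIAL_1934_38 = 95.5   # average official ("biological") grain crop, 1934-38
GOSPLAN_1913 = 81.6       # Gosplan estimate of 1913 grain output
DISCOUNT = 0.20           # rough discount needed to reach a "harvested" basis


def harvested(official_crop: float, discount: float) -> float:
    """Convert an official on-the-root estimate to a harvested basis."""
    return official_crop * (1.0 - discount)


harvested_1934_38 = harvested(OFFICIAL_1934_38, DISCOUNT)
print(round(harvested_1934_38, 1))             # 76.4
print(harvested_1934_38 < GOSPLAN_1913)        # True
```

On these figures the harvested 1934-38 average falls short of the 1913 level by roughly 5 million tons, instead of exceeding it.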

When the League of Nations was publishing its monograph (1943), 
the by no means complicated analyses of grain crops and other food 
products from the point of view of utilization were being made in 
hundreds of government and private places, including international 
organizations such as the Combined Food Board. Had the analysts of 
the League of Nations approached the problem of Soviet crops with this 
implement, their figures would promptly have fallen to pieces. In 1909- 
13, the pre-1939 Soviet territory exported about 11 million metric tons. 
There were some 20 million fewer horses in 1934-38 than in 1909-13 and 
fewer also of all kinds of productive livestock; this decline involved a 
saving of 6-7 million tons of grain. The population was only about 30 
million larger in the later period; with the same rations, which were 
certainly not available, an additional requirement of only 7.5 million 
tons was involved. With these facts in mind, the 24.6 million tons of 
additional production in 1934-38 according to the computation of the 
League of Nations just does not make sense. Yet the inescapable eco- 
nomic, social, and political conclusions from the findings of the League 
are certainly far-reaching. 
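
The utilization check described above amounts to a simple balance, sketched here in Python. All tonnages are those quoted in the paragraph. The parameter `exports_1934_38` is a hypothetical addition: the text gives no export figure for the later period, so the sketch assumes zero by default, and that assumption should be replaced if a figure is available.

```python
# Figures quoted in the text, in million metric tons.
CLAIMED_INCREASE = 24.6       # League of Nations: 1934-38 over 1909-13
EXPORTS_1909_13 = 11.0        # grain exported from the pre-1939 territory
FEED_SAVING_RANGE = (6.0, 7.0)  # saving from fewer horses and other livestock
EXTRA_POPULATION_NEED = 7.5   # 30 million more people at unchanged rations


def unexplained_grain(feed_saving: float, exports_1934_38: float = 0.0) -> float:
    """Grain left unaccounted for if the claimed increase were real.

    exports_1934_38 is an assumed value (default 0.0), not a figure
    given in the text.
    """
    released = (EXPORTS_1909_13 - exports_1934_38) + feed_saving
    return CLAIMED_INCREASE + released - EXTRA_POPULATION_NEED


for saving in FEED_SAVING_RANGE:
    print(saving, round(unexplained_grain(saving), 1))
```

Under these assumptions the claimed increase would leave some 34 to 35 million tons of grain with no identifiable use, which is the sense in which the League's figure "does not make sense."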

Frank Lorimer’s study on Soviet population published by the League 
of Nations¹¹ is likewise noteworthy. The question may never be settled 
whether there were 5 or 10 million persons in concentration camps of 
Soviet Russia in 1939. It is nevertheless an odd experience to read a 
book of 289 pages in folio with over 100 numbered tables, 31 charts and 
22 double-page maps, and not to find mention of the phrase “concentra- 
tion camps.” Were their inmates counted by the 1939 census at all? If 





9 Five-Year Plan of Development of National Economy of the U.S.S.R., 3rd ed. (Moscow, 1930), I, 
p. 144. 

10 The Socialized Agriculture of the U.S.S.R., pp. 545 and 548. 

11 The Population of the Soviet Union. History and Prospects, League of Nations, 1944. 
so, were they included under rural or urban population? Were they in- 
cluded with the labor force? An answer to all these relevant questions 
is sought in vain. 


INTERNATIONAL LABOR OFFICE 


After having read A. Yugow’s 1947 article “Reconstruction and Re- 
conversion in the U.S.S.R.”¹² one has the impulse to check whether the 
International Labor Office was really the publisher, and, if so, whether 
there was at least a statement that the Office accepted no responsibility for 
signed articles. So far as Yugow’s text is concerned there is not a word 
which could not have been written by an editor of Pravda. 

Thus, everything is running untrammeled according to plan in the 
Soviet Union. Reconversion to civilian needs is proceeding under the 
most favorable conditions. “Full employment is assured” (p. 76); “Re- 
conversion is not accompanied by the struggle of various social classes” 
(p. 76); “No superfluous plants, unused railways, unnecessary ma- 
chinery and equipment” (p. 64); etc. It was of course not believed 
worth mentioning that reconversion was considerably slowed down by 
continued armament production and that specifically the continued 
utilization of tractor factories for tank production resulted in a negligi- 
ble output of tractors in 1945 and 1946 and that the shortage of farm 
machinery considerably contributed to the small output of food in the 
famine-year 1946. The phony “unchangeable 1926-27 prices” are, it 
would appear, true measures of economic progress. In 1950, industrial 
output will exceed that of 1940 by 48 per cent (p. 65) and even the 
agricultural output will be larger by 27 per cent (p. 68). There is no 
trace of inflation¹³ and of course there is 100 per cent co-operation from 
the population—even in the unpaid Saturday-Sunday work (p. 69). 

Repeated in Yugow’s article is even the famous lie that “the number 
of workers from the villages dropped considerably [in the late ’thirties] 
as a result of improved conditions of life and work in the collective farm 
village” (p. 70). Actually, a considerable improvement occurred only as 
compared with the early ’thirties, which included the winter 1932-33 
when millions of peasants died from starvation. There was no improve- 
ment as compared with the pre-collectivization status. 

On page 73 Yugow enumerated in detail the considerable reductions 





12 International Labour Review, LV, January-February 1947, pp. 62-76. 

13 The law of December 13, 1947 disclosed the existence of hundreds of billions of rubles in the 
hands of the population in excess of the needs of circulation. Yugow faithfully continued to deny in- 
flation until the very day of issuance of the law. See his article “What Happened to the Soviet Ruble?” 
Russkii Golos, December 13, 1947. Russkii Golos is for all purposes a Communist paper in New York 
and Yugow its regular contributor. 
in retail prices in 1945-46 and the conclusion is drawn that the reduc- 
tions “have substantially raised the level of real wages of the working 
masses.” He failed to mention that the reductions involved the prices 
in government commercial stores, where the prices were so stupendously 
high before the reduction that the goods remained outside the reach of 
the workers, and continued so even after all reductions. A substantial 
improvement of real income may have been involved only for the rela- 
tively few at the top, for those who are not ordinarily included with 
“working masses” outside of the USSR. Yugow neglected to mention 
that effective September 1, 1946, the prices of rationed food were in- 
creased almost three-fold.¹⁴ 

In its next issue, International Labour Review had a report “Social 
Insurance in the Soviet Union,” significantly starting with the words 
(p. 261): “The Russian Social Insurance Scheme has a number of out- 
standing features, which, until the last year or two, were to be found 
in it alone, but some of them have now been incorporated in other na- 
tional schemes.” In the section “Family Responsibility Provision” the 
report neglected to mention that the right to monthly allowances is re- 
stricted to the second to fourth years of age of the last child, the one who is 
the basis for computing the benefit payments. While the expenses in the 
first year of the child are covered by lump sums provided separately, a 
woman with five children 5 to 12 years of age is not entitled to any pay- 
ments. The puzzling age restriction was rather widely commented 
upon in the press and should not have escaped the attention of an in- 
ternational journal. Also unmentioned was the fact that the benefit 
payments established in 1944 were effectively reduced severely by the 
near-trebling of the retail prices of rationed food in government stores 
in 1946—without any adjustment of the payments. Moreover, after 
January 1, 1948, the payments were cut in half.¹⁶ There is little security 
in such security payments. 


UNITED NATIONS 


Gunnar Myrdal, Economic Secretary of the Economic Commission 
for Europe of the United Nations, writes in the preface to A Survey of 
the Economic Situation and Prospects of Europe in 1948:¹⁷ 


Unfortunately, the lack of information concerning economic conditions in 
certain countries and the varying character of the information that is avail- 
able on others has not made it possible to include all the countries of Europe 





14 “Soviet Union: Trends in Prices, Rations and Wages,” Monthly Labor Review, July 1947, p. 5. 

15 International Labour Review, LV, March-April 1947, p. 271. 

16 Notes on Labor Abroad, U. S. Department of Labor, April 1948, pp. 42-45. 

17 Geneva, 1948, p. 111. 
in the same comprehensive treatment. In particular... the nature of the 
available information concerning the economic situation and developments 
in the Union of Soviet Socialist Republics was such that it could not be 
generally assimilated with the material relating to other European coun- 
tries... . The available information on the current situation of the Soviet 
Union and its development plans are, however, summarized separately in 
Appendix A, III. 

One can be reasonably certain that the famous Swedish economist 
was not actually unhappy that the paucity of data released by the 
Soviets after the war—they cannot be called statistics—made it possi- 
ble to isolate the reasonably reliable data for most other European 
countries from contamination by propaganda material. Yet, the Soviet 
Union being a member of the United Nations, the Soviets came fully to 
their rights in Appendix A, III. This Appendix is not a simple repro- 
duction of Soviet data, for which the Soviets alone could be believed 
responsible; the evidence is presented in the form of an article, which 
obviously is the responsibility of the United Nations. 

There are only a few minor things in this Appendix which would not 
have passed Moscow censorship. Moscow, for example, might have ob- 
jected to footnote 1 on page 149, which points out the familiar upward 
bias of the indexes of production calculated at 1926-27 prices. The 
footnote, however, is so carefully worded that the reader would never 
visualize exaggeration of the order of magnitude of 100 per cent actually 
involved. The Soviets also might not have been pleased that the rates 
of fulfillment of the 1946-47 plans were shown (p. 154). But otherwise 
the article proceeds strictly on the Soviet pattern of comparing goals 
of the current plan with attainments during the preceding plans. Thus 
not only are the goals of the current plan made to appear particularly 
large, but also the mostly large failures of the preceding plans are 
concealed. 

The Soviet Union has added considerably to her territory since 1939. 
The Soviets have never stated the territory to which their 1939 and 
1940 data pertain. Data for the years before 1939 clearly covering only 
the pre-1939 territory, those for 1939 and 1940 pertaining to an un- 
known territory, and those for 1945 and later years for the postwar 
territories are indiscriminately compared in Soviet statistics and other 
official pronouncements without mention of the fact that they pertain 
to different territories. The publication of the Economic Commission 
for Europe here analyzed faithfully follows the same pattern. On page 
152, for example, the goals of the 4th Plan are related to the goal of the 
3rd Plan. Disregard of the changes in territory went so far that on page 
145 one can read: “Between 1942 and 1944, the arable land in the unoccupied 
regions increased by over 12 million hectares.” The acreage in 
the territory which was unoccupied in 1942 possibly even declined be- 
tween 1942 and 1944. 

Data on agricultural production in the report include portions never 
harvested, for example on page 153, without mention of this important 
fact. On page 148 the report gives a table (No. XVI) entitled, “The 
Recovery of Agricultural Production in the U.S.S.R.” All that one finds 
in the table are the percentages by which the 1947 crops are supposed 
to have exceeded those of 1946. Every serious student knows, however, 
that there was no such increase as 58 per cent in grain crops from 1946 
to 1947 or, for that matter, no such increase in sunflower-seed output 
as 79 per cent. A large fraction of these increases represents merely the 
reorganization of the crop-estimating organization mentioned above 
(see p. 50). 

Even if each and every figure reported by the United Nations were 
correct—as they are not, and by a wide margin—the picture would be 
distorted because the report contains only evidence which the Soviets 
believed it worth while to report. On page 146, the United Nations 
write: 

It was also stated that the gross output of civilian goods increased by 20 
per cent in 1946 (as compared with 1945), while the gross output of the 
entire industry increased by a further 22 per cent in 1947, including a 33 per 
cent increase of the textile and light industries. 


Other official data imply that the decline in armament production in 
1946 (as compared with 1945) overcompensated the increase in output 
of civilian goods.¹⁸ Also, the “further” in the above citation is simply 
wrong; there was no further increase in 1947, because the rise in 1947 
merely compensated for the 1946 decline according to Soviet data. 
It is only fair to remark, however, that by holding the Soviet data 
apart, the publication of the Economic Commission for Europe rep- 
resented a great improvement. Selected World Economic Indices, a 
monthly of the Department of Economic Affairs of the United Nations, 
gave ground for hope that this segregation would become a permanent 
feature of the United Nations publications. Unfortunately, the latest 
publication of the same Department,¹⁹ Major Economic Changes in 





18 The decline in total industrial output in 1946 as implied in official data was estimated at 18.9 per 
cent by Alexander Gerschenkron (“A Note on Russian Industry in 1947,” The American and East 
European Review, VII, 2, April 1948, p. 138). In the incorrect interpretation by Harry Schwartz (“Soviet 
Postwar Industrial Output,” The Journal of Political Economy, LVI, October 1948, p. 441), the decline 
was estimated as 32.4 per cent. 

19 Lake Success, New York, January 1949. 
1948, shattered this hope. Indeed, while the publication gives the world 
indices of industrial production also exclusive of the United States, and 
exclusive of Japan and Germany (p. 3), no computation for the world 
exclusive of the U.S.S.R. is presented. 

The compilation on p. 34 of the publication shows that in increase of 
industrial output from 1937 to January-September 1948, the U.S.S.R., 
a large part of whose industry was destroyed during the war, is slightly 
below only Canada and the United States (increases of 65 per cent as 
against 67 per cent for Canada and 69 per cent for the United States). 
The U.S.S.R. is far ahead of Sweden, the neutral country with the 
greatest increase in industrial output over the period (the increase in 
Sweden was equivalent to 43 per cent), not to speak of the uninvaded 
United Kingdom, which can claim an increase of only 9 per cent. 

The Department of Economic Affairs did not state the weights as- 
signed in its computation to the various countries in the base year 
(1937), but if it used the weights worked out by the League of Nations 
in Industrialization and Foreign Trade for 1936-1938, the inclusion of 
the U.S.S.R. raised the index of world industrial production for 
January—September 1948 (1937=100) from 124 to 132. 
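
As a rough check on this arithmetic, the two index values imply the weight that the Soviet figure must carry. The sketch below assumes, which the publication did not state, that the world index is a base-year-weighted arithmetic mean of the country indices; the function name is illustrative, and the Soviet index of 165 is simply 100 plus the 65 per cent increase cited above.

```python
def implied_weight(index_without, index_with, index_country):
    """Base-year weight w that solves
    (1 - w) * index_without + w * index_country = index_with."""
    return (index_with - index_without) / (index_country - index_without)

# World index excluding the U.S.S.R. (124), world index including it (132),
# and the official Soviet index implied by a 65 per cent rise over 1937 (165).
w_ussr = implied_weight(124.0, 132.0, 165.0)
print(round(w_ussr, 3))
```

Under that assumption, a little under a fifth of the 1937 world weight would rest on the Soviet index alone.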

The index for the U.S.S.R. in the U.N. study is, of course, based on 
the “unchangeable 1926-27 prices,” which were exaggerating the indus- 
trial output after 1937 even more than they did in 1928-37. No adjust- 
ment was made even for the substantial expansion in territory. An 
analysis of the present writer, based on the comparison of the output 
of industrial goods in physical terms, of transports of freight, of labor 
force, etc., with an adjustment for the changes in territory, indicates 
that the industrial output of the pre-1939 Soviet territory in January- 
September 1948 was only about equal to that of 1937. The output 
of the present territory in those months is unlikely to have significantly 
exceeded that of the pre-1939 territory in 1937 by more than 10 per 
cent. 

On pages 3-4 of the same publication, the changes from 1947 to 
January-September 1948 are analyzed. The biased Soviet official in- 
dices in terms of “unchangeable 1926-27 prices,” which indicate an in- 
crease in industrial output of 14 per cent in 9 months, are compared 
with the indices of other countries; for example, with that for the 
United States, which points to an increase of only 3 per cent in the same 
period. 

It may be mentioned in this connection that lately the deterioration 
of the “unchangeable 1926-27 prices” reached such proportions that 
they became unbearable even for the unpretentious U.S.S.R. When the 
Major Economic Changes in 1948 reached the readers, those interested 
knew that the use of those prices in planning was discontinued in the 
U.S.S.R. 


FOOD AND AGRICULTURE OFFICE OF THE UNITED NATIONS 


When reading the F.A.O. evidence on Soviet agriculture, one may 
even wish that the Organization had limited itself to the official Soviet 
pronouncements—even such unbelievable ones as an increase in grain 
crops by 58 per cent in 1947, or the attainment of prewar grain yields 
in that year. 

On page 94 of the F.A.O. report, The State of Food and Agriculture— 
1948, one reads: “The increase of over 4 million hectares per year in the 
arable area in the eastern regions during the war years 1942-44... .” 
A later extension in the area was also claimed (pp. 93-94). No support- 
ing evidence for an increase by some 9 million hectares in arable area of 
that territory can be given. Actually arable land in crops is likely to 
have been less in 1947 than before the war there. 

On the same page 94 of the F.A.O. report it is stated: 

A recent government directive, June 1948, . . . asking farmers for wheat 
and rye yields of 3.2 metric tons per hectare in districts with adequate rainfall, 
i.e., the Ukraine and Northern Caucasus, and 2.5 tons per hectare in 
the southeastern districts of the country (in 1948 and 1947). 


The F.A.O. report either neglected to mention that the goals were for 
irrigated land (in which case adequate or inadequate rainfall cannot 
have any bearing on yields), or it added the straw and chaff to the 
expected yield of grain proper. 

In the F.A.O. Yearbook of Food and Agriculture Statistics—1947, page 
56, the 1946 sugar output of the Soviet Union is given at 2,385,000 
metric tons. Such easily accessible sources as the 4th Five-Year Plan, 
Andreev’s February 1947 report, and the Gosplan’s yearly and quar- 
terly reports (all published in every Soviet paper) indicate that the 
1946 sugar output could not have exceeded 900,000 metric tons. In con- 
junction with other information such as the statement by Skvotzov, the 
Minister for Technical Crops, in Socialist Agriculture, May 1, 1946, the 
above evidence implies that the 1946 sugar output was below 600,000 
metric tons. The Office of Foreign Agricultural Relations of the United 
States Department of Agriculture is slightly too high with its estimate 
of 700,000 metric tons raw-sugar value. 

Those are merely samples of “statistics” and “evidence”; space permitting, 
they could have been continued almost indefinitely. One 
wonders how it happened that among the numerous and quite sizable 
errors of the F.A.O. there is difficulty in finding a mistake which would 
minimize rather than exaggerate Soviet activities and achievements. 


CONCLUSION 


The abnormal situation created because the representative of the 
USSR in the United Nations has insisted on coverage of the Soviet 
Union in United Nations economic publications, while no statistics are 
provided, aroused a protest from the editors of The American Statistician, 
a publication of the American Statistical Association.²⁰ But they 
protested only against the withholding of the statistics—a minor defect 
of Soviet statistics. Compliance with the requests of the editors would 
merely lead to even greater flooding of the publications of the interna- 
tional organizations with values at “unchangeable 1926-27 prices,” on- 
the-root crop estimates, and similar “statistics.” 

The grave difficulty of the statistical offices of the international or- 
ganizations is well recognized. One must be extremely reluctant to 
omit a country which comprises a fifth of the world’s territory and 
whose influence, at least at the moment, is even greater. Yet the prac- 
tice of making the publications of the international organizations an 
implement of propaganda can hardly be continued. 

The writer does not pretend to know the solution. Possibly the pub- 
lication of Soviet data, if it is to continue at all, must be strictly limited 
to official figures with citations of the sources in sufficiently large type 
right below the data. Any text that is not a citation of strictly 
official information and clearly set up as a citation, with data and 
other paraphernalia, should be excluded. It must be realized that even 
such a reproduction of official data is tantamount to distortion because 
it will be limited to the relatively more favorable topics, on which the 
Soviets are willing to supply information. For example, the Central 
Office of National Economic Accounting in its report for 1948, after 
having given the increases in livestock owned by the collective farms, 
stated: “Livestock which constitutes the personal property of collective 
peasants, workers and employees has likewise increased.” Even careful 
readers have overlooked that the statement implies that livestock of the 
individual peasants, of which there are millions, declined. The damage 
resulting from suppression of certain data might possibly be avoided 
by a clear-cut enumeration of the items, on which the respective or- 
ganization is entitled to receive information, but is not given it by the 
Soviet government. 

The F.A.O., so far as concerns crops, is in a much more favorable 





20 II, June 1948. 
position than the other international organizations. All F.A.O. member 
countries estimate their crops as harvested. A Soviet law prohibits its 
statistical organizations from collecting such data; according to law, only 
yields and crops before harvest are ascertained. Hence the F.A.O. is even 
formally free to release private and unofficial estimates of the Soviet 
yields and crops as harvested. It would, of course, be useless for it to 
look for such data in the Soviet sources. Although the Soviet officials 
know the yields and crops as harvested from the reports of the state 
and collective farms, they can only speak of yields and crops before 
harvest. 

The F.A.O. is fortunate to be located at a place where genuine research 
on Soviet crops is made. It might well use the careful estimates 
of the United States Department of Agriculture, or perhaps form a 
special and impartial committee of experts in Soviet agriculture. In this 
way the F.A.O. might easily place itself in the position of being able to 
supply the member countries with a picture which, while not exact, 
would be very much nearer the truth than that implied in the official 
Soviet data. At least the general appraisal and the direction could be 
correct. 

A short time ago, the Soviet Office of National Economic Accounting 
was separated from the Gosplan (State Planning Committee) and put 
directly under the Council of Ministers. To celebrate this event, it as- 
serts in the report on the economic year 1948 that living standards of 
the workers and employees more than doubled in the single year 1948. 
Yet we know that living space increased by about 2 per cent over the 
year, the output of all consumers’ goods is unlikely to have increased 
by more than 20 per cent, and no reports of immense money savings by 
the workers and employees are forthcoming. Will the international organizations 
accept the above revelation as statistics and inform the 
world accordingly? 





THE QUANTIFICATION OF QUALITATIVE DATA 
IN DISCRIMINANT ANALYSIS 


PALMER O. JOHNSON 
University of Minnesota 


This paper discusses the extension of discriminatory analysis 
to the case where the primary data are qualitative. The gen- 
eral principle in the use of the discriminant function in the 
case of two classes is to determine a set of adjustable coeffi- 
cients so chosen as to maximize the ratio of the difference be- 
tween sample means to the standard deviation within the two 
classes. When only a single chosen component is to be maxi- 
mized relative to a set of other components, the equations are 
linear. In the case discussed here we have a two-way table of 
non-numerical observational data where the solution of equa- 
tions of higher degree is required. Appropriate values are to 
be determined in order that the observations may be made as 
additive as possible. Application is made to the problem of 
scoring letter grades in school subjects so as to maximize in- 
dividual differences. 


FISHER [1] AND MAUNG [2] have applied the method of discriminant 
analysis in biological and anthropological studies with qualitative 
data. In education, psychology, and in other fields similar situations 
arise. For instance, individual students have different grades, say, 
A, B, C, D, and F in different school subjects. The conventional method 
for quantifying such data is to assign 3, 2, 1, 0, and −1 to the respective 
grades. However, this arbitrary scaling of letter grades does not utilize 
the information which the data have with respect to the values to be 
assigned to the letter grades to make them as additive as possible. 
Since student achievement is most frequently recorded in terms of letter 
grades, it is often necessary to quantify the grades in order that they 
may be advantageously used in further statistical analysis. There are 
other needs for the quantification of qualitative data, such as in the 
scoring of certain achievement and interest measures, in scoring the 
returns from questionnaires, and in utilizing qualitative data for pre- 
dictive purposes. 

Discriminant analysis is used here to obtain appropriate scores for 
the letter-grades, based on the principle of maximizing the ratio of the 
sum of squares due to variation of scores between individual students 
to that due to variation among all school subjects and all individuals, 
or to total sum of squares. 

We give first the theoretical basis for the technique of the discriminant 
function and then present a numerical example. 

THE DERIVATION OF THE SCORING SYSTEM 


Let us assume a certain number of individuals, each of whom has 
taken a number of school subjects. Each individual receives a letter 
grade, say A, B, C, D, or F, for every subject. Let $a_{tij}$ denote the 
frequency of the $t$th letter grade on the $j$th school subject for the $i$th 
individual; $a_{ti}$ denote the frequency of the $t$th letter grade on all school 
subjects for the $i$th individual; and $a_t$ denote the corresponding 
frequency for all the individual observations. 

Let us assume $t = 1, \dots, m$; $i = 1, \dots, p$; $j = 1, \dots, q_i$; where $m$ is 
the number of letter grades, $p$ is the number of individuals, and $q_i$ is 
the number of school subjects for the $i$th individual. The appropriate 
score, that is, the value or score assigned to the $t$th letter grade in 
order to make the letter grades as additive as possible, may be denoted 
by $k_t$. The scores $X_{ij}$, $X_i$, and $X$ of the school subject, the individual 
student, and of all the observations may be defined as follows: 

\[
X_{ij} = \frac{1}{n_{ij}} \sum_t k_t a_{tij}, \qquad
X_i = \frac{1}{n_i} \sum_t k_t a_{ti}, \qquad
X = \frac{1}{n} \sum_t k_t a_t, \tag{1}
\]

where $\sum_t$ denotes summation over all letter grades, and 

\[
n_{ij} = \sum_t a_{tij}, \qquad n_i = \sum_t a_{ti}, \qquad n = \sum_t a_t. \tag{2}
\]


Then the total sum of squares of scores for all school subjects and all 
individuals is given by 

\[
\sum_i \sum_j n_{ij} (X_{ij} - X)^2
= \sum_i \sum_j \frac{1}{n_{ij}} \sum_t \sum_r k_t k_r a_{tij} a_{rij}
- \frac{1}{n} \sum_t \sum_r k_t k_r a_t a_r,
\qquad t, r = 1, \dots, m. \tag{3}
\]


Similarly, the sum of squares of scores between individuals is given by 

\[
\sum_i n_i (X_i - X)^2
= \sum_i \frac{1}{n_i} \sum_t \sum_r k_t k_r a_{ti} a_{ri}
- \frac{1}{n} \sum_t \sum_r k_t k_r a_t a_r,
\qquad t, r = 1, \dots, m, \tag{4}
\]


and that among school-subjects within individuals by 

\[
\sum_i \sum_j n_{ij} (X_{ij} - X_i)^2
= \sum_i \sum_j \frac{1}{n_{ij}} \sum_t \sum_r k_t k_r a_{tij} a_{rij}
- \sum_i \frac{1}{n_i} \sum_t \sum_r k_t k_r a_{ti} a_{ri},
\qquad t, r = 1, \dots, m. \tag{5}
\]


Assigning arbitrary values $0, y_1, y_2, \dots, y_{m-2}$, and $1$ for $k_1, k_2, 
k_3, \dots, k_{m-1}, k_m$, the sums of squares due to variations between individuals, 
and for all school-subjects and all individuals, may be expressed 
as functions of $y_1, \dots, y_{m-2}$. Thus 

\[
\sum_i n_i (X_i - X)^2
= \sum_u A'_{uu} y_u^2 + 2 \sum_{u<v} A'_{uv} y_u y_v
+ 2 \sum_u A'_{u(m-1)} y_u + A'_{(m-1)(m-1)} = S_1, \tag{6}
\]

\[
\sum_i \sum_j n_{ij} (X_{ij} - X)^2
= \sum_u A_{uu} y_u^2 + 2 \sum_{u<v} A_{uv} y_u y_v
+ 2 \sum_u A_{u(m-1)} y_u + A_{(m-1)(m-1)} = S_2, \tag{7}
\]

with $u, v = 1, \dots, m-2$, where all the $A'$'s and $A$'s are the calculated coefficients. 

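We wish to explain these notations by an illustration: 

Suppose there are five letter grades (see Table 1) to be scored; then 
the arbitrary values assigned to $k_1, k_2, k_3, k_4, k_5$ will be $0, y_1, y_2, y_3, 1$. 
Therefore, we have 

\[
\begin{aligned}
\sum_i n_i (X_i - X)^2 ={}& A'_{11} y_1^2 + A'_{22} y_2^2 + A'_{33} y_3^2 + 2A'_{12} y_1 y_2 \\
& + 2A'_{13} y_1 y_3 + 2A'_{23} y_2 y_3 + 2A'_{14} y_1 \\
& + 2A'_{24} y_2 + 2A'_{34} y_3 + A'_{44},
\end{aligned} \tag{8}
\]

\[
\begin{aligned}
\sum_i \sum_j n_{ij} (X_{ij} - X)^2 ={}& A_{11} y_1^2 + A_{22} y_2^2 + A_{33} y_3^2 + 2A_{12} y_1 y_2 \\
& + 2A_{13} y_1 y_3 + 2A_{23} y_2 y_3 + 2A_{14} y_1 \\
& + 2A_{24} y_2 + 2A_{34} y_3 + A_{44}.
\end{aligned} \tag{9}
\]

Specify $\theta$ as the ratio of (6) to (7). The maximum value of $\theta$ is given 
by the largest root of the following equation of $(m-1)$-th degree: 

\[
\begin{vmatrix}
A'_{11} - A_{11}\theta & \cdots & A'_{1(m-1)} - A_{1(m-1)}\theta \\
A'_{21} - A_{21}\theta & \cdots & A'_{2(m-1)} - A_{2(m-1)}\theta \\
\vdots & & \vdots \\
A'_{(m-1)1} - A_{(m-1)1}\theta & \cdots & A'_{(m-1)(m-1)} - A_{(m-1)(m-1)}\theta
\end{vmatrix} = 0. \tag{10}
\]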

There are many methods of solving a polynomial of higher degree 
for $\theta$. Fisher [1, pp. 292-294] used the method of divided differences. 
In his illustration, beginning with a four-rowed determinant the elements 
of which were linear functions of the unknown $\theta$, the value of the 
determinant was calculated for selected values of $\theta$. Then by interpolation 
using divided differences, the largest value of $\theta$ was found which 
makes the determinant zero. 

Since $\theta$ is essentially a real and positive quantity, $m-1$ positive roots 
exist in equation (10). The appropriate values $y_1, y_2, \dots, y_{m-2}$ are obtained 
from the following set of simultaneous equations 

\[
\frac{\partial}{\partial y_1} (S_1 - \theta S_2) = 0, \quad \dots, \quad
\frac{\partial}{\partial y_{m-2}} (S_1 - \theta S_2) = 0 \tag{11}
\]

by substituting the maximum value of $\theta$, since the scores sought are 
those that make the variance ratio a maximum. 
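In our special case where all individuals take the same school-subjects 
and the number of subjects is $q$, we have 

\[
a_{tij} = 1 \text{ or } 0; \qquad n_{ij} = 1; \qquad n_i = q; \qquad n = pq. \tag{12}
\]

Therefore, 

\[
X_{ij} = \sum_t k_t a_{tij}, \qquad
X_i = \frac{1}{q} \sum_t k_t a_{ti}, \qquad
X = \frac{1}{pq} \sum_t k_t a_t. \tag{13}
\]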

Then the total sum of squares of scores for all school-subjects and all 
individuals is given by 

\[
\sum_i \sum_j (X_{ij} - X)^2
= \sum_i \sum_j \Big\{ \sum_t k_t^2 a_{tij}^2 \Big\}
- \frac{1}{pq} \Big\{ \sum_t k_t^2 a_t^2 + 2 \sum_{t<r} k_t k_r a_t a_r \Big\},
\qquad t, r = 1, \dots, m. \tag{14}
\]

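The sum of squares between individuals is given by 

\[
q \sum_i (X_i - X)^2
= \frac{1}{q} \sum_i \Big\{ \sum_t k_t^2 a_{ti}^2 + 2 \sum_{t<r} k_t k_r a_{ti} a_{ri} \Big\}
- \frac{1}{pq} \Big\{ \sum_t k_t^2 a_t^2 + 2 \sum_{t<r} k_t k_r a_t a_r \Big\},
\qquad t, r = 1, \dots, m. \tag{15}
\]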

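The sum of squares among school-subjects and within individuals is 
given by 

\[
\sum_i \sum_j (X_{ij} - X_i)^2
= \sum_i \sum_j \Big\{ \sum_t k_t^2 a_{tij}^2 \Big\}
- \frac{1}{q} \sum_i \Big\{ \sum_t k_t^2 a_{ti}^2 + 2 \sum_{t<r} k_t k_r a_{ti} a_{ri} \Big\},
\qquad t, r = 1, \dots, m. \tag{16}
\]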
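
The three sums of squares for the special case (every individual takes the same $q$ subjects, with one grade per subject) can also be computed directly from their definitions, which gives a convenient check on any hand calculation. The following is a minimal sketch; the function name, the toy grades, and the 4-to-0 scoring are hypothetical and are not taken from the paper's data.

```python
def sums_of_squares(grades, scores):
    """Return (total, between, within) sums of squares of scored grades.

    grades: one row per individual, each row a list of q letter grades.
    scores: dict mapping each letter grade to its numerical score k_t.
    """
    p = len(grades)
    q = len(grades[0])
    x = [[scores[g] for g in row] for row in grades]   # X_ij
    x_i = [sum(row) / q for row in x]                  # X_i, mean per individual
    x_bar = sum(sum(row) for row in x) / (p * q)       # X, grand mean
    total = sum((v - x_bar) ** 2 for row in x for v in row)
    between = q * sum((m - x_bar) ** 2 for m in x_i)
    within = sum((v - m) ** 2 for row, m in zip(x, x_i) for v in row)
    return total, between, within

# Hypothetical toy data: three students, two subjects each.
toy = [["A", "B"], ["C", "C"], ["F", "D"]]
conventional = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}
total, between, within = sums_of_squares(toy, conventional)
theta = between / total   # fraction of the total ascribable to individuals
print(total, between, within, theta)
```

The total always splits into the between and within parts, so the ratio of between to total lies between 0 and 1 for any choice of scores; the scoring problem is to choose the scores that make this ratio as large as possible.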


As a final step, we should set up an analysis of variance table for 
testing the hypothesis of no difference between individuals as follows: 


ANALYSIS OF VARIANCE TABLE 

Source of Variation      Degrees of Freedom      Sums of Squares 

Between individuals 
Remainder                                        $1 - \theta$ 

Total 


TABLE 1 

THE CREDIT-LETTER GRADES OF 40 ENGINEERING STUDENTS 
IN FOUR COMMON SUBJECTS 

Mathematics 11   Chemistry 4   English 4     Drawing 1 
(5 Credits)      (4 Credits)   (3 Credits)   (3 Credits) 

5D               4C            3C            3C 
5D               4A            3C            3B 
5B               4B            3C            3B 
5D               4C            3C            3B 
5A               4A            3B            3C 
5B               4B            3B            3B 
5D               4C            3B            3B 
5C               4C            3C            3C 
5C               4C            3C            3B 
5B               4B            3A            3B 
5A                             3C            3B 
5F                             3C            3B 
5F                             3D            3D 
5D                             3B            3D 
5B                             3B            3C 
5A                             3B            3B 
5D                             3D            3C 
5C                             3D            3B 
5F                             3F            3D 
5B                             3A            3A 
5D                             3C            3B 
5D                             3B            3B 
5A 
5D 
5C 
5C 
5C 
5B 
5D 
5B 
5B 
5C 
5B 


NUMERICAL EXAMPLE: SCORING LETTER GRADES 


For an illustration of the quantification of qualitative data by means 
of the discriminant function we present the letter grades of 40 engineer- 
ing students who took the same four school-subjects carrying 15 quarter 
credits, taken at the University of Minnesota. We wish to determine 
the score value to be assigned each of the letter grades in order to 
differentiate most sharply among the individual students. The primary 
data are presented in Table 1. 

Let us arbitrarily assign the value 0 to grade F, and the value 1 to 
grade A; the values corresponding to the grades D, C, and B are given 
the algebraic values $y_1$, $y_2$, and $y_3$. Then by substituting the appropriate 
values calculated from Table 1 in equations (6) and (7) we obtain: 

Sum of squares for “between” individuals: 

\[
\begin{aligned}
& 14796y_1^2 + 27564y_2^2 + 25079y_3^2 + 11676 - 10824y_1y_2 \\
& - 17156y_1y_3 - 10552y_1 - 25948y_2y_3 - 10536y_2 + 836y_3.
\end{aligned} \tag{17}
\]

The sum of squares for “total”: 

\[
\begin{aligned}
& 49196y_1^2 + 82604y_2^2 + 80199y_3^2 + 33356 - 41944y_1y_2 \\
& - 39396y_1y_3 - 12152y_1 - 86028y_2y_3 - 26536y_2 - 24924y_3.
\end{aligned} \tag{18}
\]

These terms include a multiplier of 600, introduced in the clearing of 
fractions. 

The values (17) and (18) can be expressed as in Tables 2 and 3. 


TABLE 2 
MATRIX FOR “BETWEEN INDIVIDUALS” (GRADES) 

          y1        y2        y3         1 
y1    14,796    −5,412    −8,578    −5,276 
y2    −5,412    27,564   −12,974    −5,268 
y3    −8,578   −12,974    25,079       418 
1     −5,276    −5,268       418    11,676 


TABLE 3 
MATRIX FOR “TOTAL” (GRADES) 

          y1        y2        y3         1 
y1    49,196   −20,972   −19,698    −6,076 
y2   −20,972    82,604   −43,014   −13,268 
y3   −19,698   −43,014    80,199   −12,462 
1     −6,076   −13,268   −12,462    33,356 


To find the values of $y_1$, $y_2$, and $y_3$, which will make the ratio of the matrix 
for “between” individuals to that for “total” as large as possible, it is 
necessary to solve an equation of the fourth degree: 

\[
\begin{vmatrix}
14796 - 49196\theta & -5412 + 20972\theta & -8578 + 19698\theta & -5276 + 6076\theta \\
-5412 + 20972\theta & 27564 - 82604\theta & -12974 + 43014\theta & -5268 + 13268\theta \\
-8578 + 19698\theta & -12974 + 43014\theta & 25079 - 80199\theta & 418 + 12462\theta \\
-5276 + 6076\theta & -5268 + 13268\theta & 418 + 12462\theta & 11676 - 33356\theta
\end{vmatrix} = 0. \tag{19}
\]


It is not necessary to calculate the coefficients of (19). It is usually 
more convenient to evaluate the determinant exactly for chosen values 
of $\theta$, and to apply the method of divided differences to solve for the 
required roots. 

Table 4 shows the values obtained at six chosen values of $\theta$, simplified 
by dividing by 1,600,000,000. The second column in Table 4 is found by 
dividing the successive differences of the first column by 0.2, the interval 
between successive values of $\theta$; the third column is likewise found 
from the second, the divisor in this case being the difference between the 
values of $\theta$ separated by two steps, which in this table is constantly 0.4. 
Since, for any expression of the fourth degree, the fourth divided difference 
is constant, accuracy of the values is checked in the last column, 
if enough values of $\theta$ are used. 

It is apparent that the root required lies between .4 and .6. Since the 
fourth difference is constant, whether the intervals are equal or not, 
positive or negative, the equation may be solved by choosing successive 
values of $\theta$ to continue the table so as to make the determinant approach 
zero. Thus in calculating the value for $\theta = .5$ a new line is added 
in which the third divided difference is increased by the fourth difference 
multiplied by .1, the multiplier being simply the new value less 
the value in the table four steps back. The new third difference is then 
multiplied by −.1, the difference in $\theta$ taken three steps back, and added 
to the second difference. The multiplier by which the new second difference 
is multiplied is −.3, since the new value of $\theta$ is .3 less than that 
used two steps back. Finally, the new first difference is multiplied by 
−.5 and added to the value of the determinant at 1.0 to find its value at 
.5. In Table 5 this line has been filled in with exact values. In the subsequent 
lines sufficient figures have been obtained by the same procedures. 
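
With a machine at hand the same root can be located without a difference table. The sketch below is not the divided-difference procedure described above: it swaps in a plain numerical search, evaluating the determinant of (19) directly, scanning downward from 1 for the first change of sign, and then bisecting. The matrices are read off from expressions (17) and (18); the function names, the scan step, and the tolerance are illustrative choices.

```python
# Coefficient matrices read off from (17) and (18): diagonal entries are the
# coefficients of the squares, off-diagonal entries are half the cross terms.
BETWEEN = [
    [14796, -5412, -8578, -5276],
    [-5412, 27564, -12974, -5268],
    [-8578, -12974, 25079, 418],
    [-5276, -5268, 418, 11676],
]
TOTAL = [
    [49196, -20972, -19698, -6076],
    [-20972, 82604, -43014, -13268],
    [-19698, -43014, 80199, -12462],
    [-6076, -13268, -12462, 33356],
]

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    value = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            value = -value
        value *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return value

def d(theta):
    """Determinant of (BETWEEN - theta * TOTAL), the left side of (19)."""
    return det([[b - theta * t for b, t in zip(rb, rt)]
                for rb, rt in zip(BETWEEN, TOTAL)])

def largest_root(step=0.01, tol=1e-10):
    """Scan downward from theta = 1 for the first sign change, then bisect."""
    hi = 1.0
    while hi - step >= 0.0:
        lo = hi - step
        if d(lo) * d(hi) <= 0.0:
            break
        hi = lo
    else:
        raise ValueError("no sign change found in [0, 1]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d(mid) * d(hi) <= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta_max = largest_root()
print(theta_max)
```

Scanning from the top rather than bisecting on a fixed bracket is a deliberate choice: it returns the largest root even when smaller roots lie nearby, and the largest root is the one the method requires.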

The value of $\theta$ so obtained is actually the fraction of the total sum of 
squares which is ascribable to “between” individuals, when this fraction 
is maximized. To obtain the corresponding score-values $y_1$, $y_2$, and $y_3$, 
we substitute the value of $\theta$ into equation (11) and obtain the following 
equations: 

\[
\begin{aligned}
-12939.3187y_1 + 6411.4227y_2 + 2527.1774y_3 &= 1850.5224 \\
6411.4227y_1 - 19005.8078y_2 + 11276.0813y_3 &= -2212.1246 \\
2527.1774y_1 + 11276.0813y_2 - 20134.9366y_3 &= -7443.7245
\end{aligned} \tag{20}
\]

The solution of (20) gives 

\[
y_1 = .387315, \qquad y_2 = .741624, \qquad y_3 = .833628.
\]


These are the score-values, of which only the first two figures need be 
used, appropriate to the letter-grades D, C, and B, respectively, if zero 
is assigned to F and unity to A for the purpose of differentiating as 
sharply among individuals as the data permit. However, before it can 
be stated whether or not the scores are effective, a test of significance 
must be made. 
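
The three simultaneous equations (20) are small enough to solve by direct elimination, which gives an independent check on the score-values. The following is a minimal sketch using Gauss-Jordan elimination with partial pivoting; the coefficients are those printed in (20), while the function and variable names are illustrative.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(rhs)
    a = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(n):
            if r != c:
                f = a[r][c] / a[c][c]
                for k in range(c, n + 1):
                    a[r][k] -= f * a[c][k]
    return [a[i][n] / a[i][i] for i in range(n)]

# Coefficients and right-hand sides of the three equations in (20).
COEFFS = [
    [-12939.3187, 6411.4227, 2527.1774],
    [6411.4227, -19005.8078, 11276.0813],
    [2527.1774, 11276.0813, -20134.9366],
]
RHS = [1850.5224, -2212.1246, -7443.7245]

y1, y2, y3 = solve(COEFFS, RHS)   # scores for D, C, and B
print(y1, y2, y3)
```

The computed values can be set beside the score-values quoted above; since only the first two figures are used in practice, small differences in the later decimals are of no consequence.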

A test of significance of rows and columns may be made directly from 
the value of $\theta$ without evaluating the scores, since only the ratio of the 
sums of squares is needed. An approximate test of significance is provided 
by adding 3 degrees of freedom, for the 3 unknowns adjusted, to 
the 53 for rows and columns, and subtracting 3 from the degrees of 
freedom for remainder. The analysis is recorded in Table 6. 


TABLE 6 

ANALYSIS OF VARIANCE OF THE NON-NUMERICAL 
CONTINGENCY TABLE 

Source of Variation      Degrees of Freedom      Sums of Squares 

Rows and columns                 56                  .43623 
Remainder                       543                  .56377 

Total                           599                 1.00000 


The variance-ratio F exceeds the 0.1 per cent point and the differences 
between different rows and columns are therefore highly significant. 
It may be inferred therefore that significant differences exist in 
the letter-grades, or among individuals. It is only under such conditions, 
of course, that the scores are of any use. 

The score values obtained from the empirical data are subject to 
sampling errors. The question of the precision of the score values ascertained 
presents some peculiar features. The discriminant function is 
unchanged if all of its coefficients are increased or decreased in propor- 
tion. Accordingly, no standard error can be calculated for each coeffi- 
cient singly. The question relevant to the coefficients of the discrimi- 
nant function may be expressed comprehensively in terms of a test of 
significance as to whether any alternative assigned system of scores is 
significantly contradicted by the data. 


TABLE 7 
SETS OF VALUES FOR THE ARBITRARILY ASSIGNED a 

Rows and Columns      Totals 

    −42,866          −76,146 
    −10,278          −37,878 
     42,383           85,023 
     32,146           63,426 







TABLE 8 

ANALYSIS OF COVARIANCE FOR ARBITRARY AND 
EMPIRICAL SCORES 

Source of Variation      Σ(a²)        Σ(aX)          Σ(X²) 

Rows and columns        192,311     43,252.599     10,590.033 
Remainder               164,560     33,467.233      8,194.169 

Total                   356,871     76,719.832     18,784.202 





We could carry out the test in our problem by comparing the ob- 
served values derived from the data with any proposed system of 
values, a. Thus, for example, retaining the score-value zero for the 
letter grade F, we shall give values 1, 2, 3, and 4 to the letter-grades 
D, C, B, and A, respectively. 

The test of significance between the coefficients thus arbitrarily 
chosen and those evaluated empirically is carried out below. 

From the data as given in Tables 2 and 3, we form a new column by 
multiplying the four columns ($y_1$, $y_2$, $y_3$, and 1) by 1, 2, 3, and 4 and 
adding. This gives the two sets of values in Table 7. 
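
The step just described amounts to multiplying each matrix by the vector of assigned scores. A minimal sketch follows; the matrices are the coefficient arrays read off from expressions (17) and (18), and the names used are illustrative.

```python
# Coefficient matrices read off from (17) and (18): diagonal entries are the
# coefficients of the squares, off-diagonal entries are half the cross terms.
BETWEEN = [
    [14796, -5412, -8578, -5276],
    [-5412, 27564, -12974, -5268],
    [-8578, -12974, 25079, 418],
    [-5276, -5268, 418, 11676],
]
TOTAL = [
    [49196, -20972, -19698, -6076],
    [-20972, 82604, -43014, -13268],
    [-19698, -43014, 80199, -12462],
    [-6076, -13268, -12462, 33356],
]
ASSIGNED = [1, 2, 3, 4]   # arbitrary scores for D, C, B, A (F stays at 0)

def weighted_column(matrix, weights):
    """Multiply the columns by the weights and add across each row."""
    return [sum(w * v for w, v in zip(weights, row)) for row in matrix]

rows_and_columns = weighted_column(BETWEEN, ASSIGNED)
totals = weighted_column(TOTAL, ASSIGNED)
print(rows_and_columns)
print(totals)
```

The two printed lists are the two columns of Table 7.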

If the four rows (Table 7) are multiplied by 1, 2, 3, and 4 and added, 
an analysis of variance is obtained for the assigned set of values a; 
likewise, if multiplied by the system of score values derived from the 
data, say X, an analysis of covariance is obtained for X and a. Similarly, 
the analysis of variance for X is found. These calculations give 
the numerical values recorded in Table 8. 


TABLE 9 

ANALYSIS OF VARIANCE OF EMPIRICAL SCORES 
ELIMINATING ARBITRARY SCORES 

Source of Variation      Degrees of Freedom      Sum of Squares      Mean Square 

Rows and columns                 55                  903.234            16.42 
Remainder                       543                1,387.802             2.56 

Total                           598                2,291.036 

Now we proceed to eliminate a by the general procedure, that is, by 
subtracting from Σ(X²) the square of Σ(aX) divided by Σ(a²), 
making use of the lines for remainder and total, and subtracting to obtain 
the values for rows and columns. The results are given in Table 9. 
The number of degrees of freedom has been reduced for rows and 
columns, since, after eliminating a, two values only are adjustable. The 
value of F exceeds the 0.1 per cent point. It is thus noted that the 
arbitrarily determined system of linear score-values is significantly con- 
tradicted by the data. In this way the precision of the scores ascer- 
tained from the data is established. While in this case the variance- 
ratio test is only approximate it is sufficiently precise to make the dif- 
ferentiation between the set of score values found from the data and 
the set of values arbitrarily assigned. 
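
The arithmetic behind the variance ratio can be retraced from the figures printed in Table 9. The following is only a sketch: the variable names are illustrative, and the comparison with the 0.1 per cent point still has to be made against a table of the variance-ratio distribution.

```python
# Sums of squares and degrees of freedom as printed in Table 9.
ss_rows_cols, df_rows_cols = 903.234, 55
ss_remainder, df_remainder = 1387.802, 543

# A mean square is a sum of squares divided by its degrees of freedom.
ms_rows_cols = ss_rows_cols / df_rows_cols
ms_remainder = ss_remainder / df_remainder

# Variance ratio of the rows-and-columns line to the remainder line.
f_ratio = ms_rows_cols / ms_remainder

print(f"mean squares: {ms_rows_cols:.2f} and {ms_remainder:.2f}")
print(f"variance ratio: {f_ratio:.2f}")
```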
ON THE CHOICE OF THE NUMBER AND WIDTH OF 
CLASSES FOR THE CHI-SQUARE TEST OF 
GOODNESS OF FIT* 


C. ARTHUR WILLIAMS, JR. 
Columbia University 


This article describes in non-mathematical fashion the technique 
suggested by H. B. Mann and A. Wald for selecting the 
number and width of class intervals for the chi-square test of 
goodness of fit when the null hypothesis distribution is 
continuous and completely specified. The number of classes is 
selected by means of a formula depending upon the sample size 
and the level of significance, and the class limits are chosen 
such that each class contains the same number of items under 
the null hypothesis. Finally it is suggested that the number of 
classes as given by the formula may be halved for practical 
purposes. 


IN MOST statistical problems the distribution of the universe from 
which a sample has been drawn is unknown. To test whether or not 
this sample was drawn from a population having a specified 
distribution, statisticians commonly employ the chi-square test of 
goodness of fit. 

In order to carry out this test, one first sets up a null hypothesis 
stating that the sample was drawn from a universe with a known 
distribution. If the parameters are based on standards, theory, or past 
experience, the distribution is completely specified. If the parameters 
are estimated from the sample, only the type of the distribution is 
specified. Next one computes 

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - Np_i)^2}{Np_i} \tag{1}$$

where $f_i$ is the actual or observed frequency in the ith class, 
$p_i$ the probability under the null hypothesis that an observation will 
fall into the ith class, N the number of observations in the sample, and 
k the number of classes. It can be shown that as the size of the sample 
approaches infinity the distribution of this statistic approaches the 
chi-square distribution with $k-1-s$ degrees of freedom, where s is the 
number of parameters estimated from the sample. In practice the 
chi-square distribution is assumed to hold for finite values of N and one 
ascertains the value of $\chi^2_{k-1-s}(\alpha)$ such that the probability of 
$\chi^2$ being greater than or equal to $\chi^2_{k-1-s}(\alpha)$ is equal to 
$\alpha$, the level of significance or probability of rejecting the null 
hypothesis when it is true. If the computed value of $\chi^2$ is equal to 
or exceeds $\chi^2_{k-1-s}(\alpha)$, the null hypothesis is rejected. If the 
computed value is less than $\chi^2_{k-1-s}(\alpha)$, one accepts the null 
hypothesis. 

* This article is based on a Master’s Essay written at Columbia University under Professor T. W. 
Anderson, Jr. 
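
The statistic of formula (1) and its degrees of freedom can be sketched in a few lines. The function names and the small set of class frequencies are hypothetical, chosen only for illustration; the critical value itself is still read from a chi-square table.

```python
def chi_square_statistic(observed, probabilities):
    """Formula (1): sum over classes of (f_i - N p_i)^2 / (N p_i)."""
    n = sum(observed)
    return sum(
        (f - n * p) ** 2 / (n * p) for f, p in zip(observed, probabilities)
    )


def degrees_of_freedom(k, s=0):
    """k - 1 - s, where s parameters were estimated from the sample."""
    return k - 1 - s


# Hypothetical example: 100 observations in four equally probable classes.
observed = [18, 22, 30, 30]
probabilities = [0.25, 0.25, 0.25, 0.25]

statistic = chi_square_statistic(observed, probabilities)
df = degrees_of_freedom(len(observed))
print(statistic, df)
```

The null hypothesis is rejected when `statistic` equals or exceeds the tabulated value for `df` degrees of freedom at the chosen level of significance.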

Despite its wide use this test has some serious limitations. Firstly 
there are many distributions which will give the same theoretical class 
frequencies as the null hypothesis distribution. It is quite possible that 
we may accept a hypothesis stating that a particular sample is drawn 
from a normal population say, when in fact it belongs to an alternative 
population which gives the same theoretical class frequencies. If the 
roles of the null hypothesis distribution and an alternative distribution 
of this type were reversed, the new null hypothesis would also be ac- 
cepted, the computed value of x? being the same as in the first case. 
Such a null hypothesis distribution and an alternative distribution are 
shown in Diagram 1. Secondly, it is possible to choose the number and 
width of class intervals for the test in many different ways, some of 
which may change the result of the test. This is demonstrated in 
Diagram 2. Let the dashed line represent the null hypothesis distribution 
and the blocks the observed frequencies. If in conducting the test we 
used classes I, II, III, and IV, we would reject the null hypothesis. If 
we used classes I’ and II’, we would accept the null hypothesis. Thus 
it is clear that as long as this subjective element remains we always run 
the risk of influencing our results by the choice of the class interval. 
Thirdly, we know nothing about the power of the test, the probability 
of rejecting the null hypothesis when it is false. 

DIAGRAM 1. Null hypothesis distribution and alternative distribution (horizontal axis: class). 

In the September 1942 issue of the Annals of Mathematical Statistics 
there appeared an article by H. B. Mann and A. Wald of Columbia 
University entitled “On the Choice of the Number of Intervals in the 
Application of the Chi-Square Test.” In their article Mann and Wald 
proposed a solution to this problem in the case where the population 
parameters are not estimated from the sample but are based on 
standards, theory, or past experience. It will be the purpose of this 
article to point out the implications of the aforementioned article for 
the practical statistician and to suggest a reduction in the number of 
classes. 


I. HOW TO USE THE MANN-WALD TECHNIQUE 


The following mechanical procedure is necessary to carry out the 
suggestions of Mann and Wald: 

1) Set up a null hypothesis stating that the sample was drawn from 
a universe with a completely specified probability distribution. 

2) Compute the number of classes to be used by means of the following 
formula: 

$$k = 4 \sqrt[5]{\frac{2(N-1)^2}{c^2}}$$

where k is the number of classes, N is the number of items in the 
sample, and c is obtained from a table of areas under the normal 
curve such that 

$$\int_c^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy = \alpha,$$

the level of significance. A sample calculation follows: 

Let us suppose that a sample of 1000 items is drawn and the 
test is to be conducted using the 5 per cent level of significance. 
The formula for k may be rewritten as follows: 

$$k = \left[ 4\ \text{antilog} \left( \frac{1}{5} \log \frac{2(N-1)^2}{c^2} \right) \right].$$

From a table of areas under a normal curve we find that for the 
5 per cent level of significance c = 1.64. Substituting the values for 
N and c in the above formula, we find that 

$$k = \left[ 4\ \text{antilog} \left( \frac{1}{5} \log \frac{2(1000-1)^2}{(1.64)^2} \right) \right]$$
$$k = \left[ 4\ \text{antilog} \left( \tfrac{1}{5} \log 742{,}008.08 \right) \right]$$
$$k = [4\ \text{antilog}\ (1.174104)]$$
$$k = [4(14.9)] = [59.6] = 59.$$


TABLE I 

NUMBER OF CLASSES (k) AND DISTANCE ($\Delta$) FOR THE 5% LEVEL OF 
SIGNIFICANCE AND FOR SELECTED VALUES OF N BETWEEN 200 AND 2000 

    N      $\Delta$† 

   200    .1605 
   250    .1469 
   300    .1343 
   350    .1284 
   400    .1213 
   450    .1157 
   500    .1112 
   550    .1052 
   600    .1024 
   650    .1000 
   700    .0961 
   750    .0945 
   800    .0914 
   850    .0887 
   900    .0877 
   950    .0855 
  1000    .0834 
  1100    .0812 
  1200    .0782 
  1300    .0757 
  1400    .0734 
  1500    .0715 
  2000    .0629 

* Values of k were obtained by taking the greatest integer contained in the value given by the 
formula. 

† Values of $\Delta$ are those for which the corresponding integer k gives the maximum power for the 
“worst distribution.” 


TABLE II 

NUMBER OF CLASSES (k) AND DISTANCE ($\Delta$) FOR THE 1% LEVEL OF 
SIGNIFICANCE AND FOR SELECTED VALUES OF N BETWEEN 200 AND 2000 

* See notes to Table I. 



Since this procedure is somewhat laborious, tables of N and k 
(Tables I and II) are provided for the 5 and 1 per cent levels of 
significance in this article. These tables are not complete since 
they deal only with selected values of N between 200 and 2000 
but for practical purposes simple interpolation will give the 
desired number of classes for any sample size within this range. Also 
included in the tables are values of $\Delta$ or “distance,” a term which 
will be introduced later. 

3) Choose the class limits such that the number of theoretical 
frequencies in each class is equal to N/k. Notice that in this 
procedure the class limits and not the class frequencies vary from one 
class to the next. 

4) From the data determine the actual number of observations 
falling in each of the classes. 

5) Compute 

$$\chi^2 = \frac{k}{N} \sum_{i=1}^{k} f_i^2 - N. \tag{2}$$

This formula is obtained by substituting $p_i = 1/k$ in formula (1) 
and simplifying it. 

6) Determine $\chi^2_{k-1}(\alpha)$ using a chi-square table. There are $k-1$ 
degrees of freedom and the level of significance is $\alpha$. Use the critical 
region $\chi^2 \geq \chi^2_{k-1}(\alpha)$ to test the null hypothesis. 
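
The six steps above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' own computation: the null distribution is taken to be the standard normal, the normal deviate c comes from Python's `statistics.NormalDist` instead of a printed table, and all function names are hypothetical. The critical value of step 6 is still read from a chi-square table.

```python
import bisect
import math
from statistics import NormalDist


def mann_wald_k(n, alpha):
    """Step 2: greatest integer in 4 * (2 (n-1)^2 / c^2)^(1/5)."""
    c = NormalDist().inv_cdf(1.0 - alpha)  # upper-tail normal deviate
    return math.floor(4.0 * (2.0 * (n - 1) ** 2 / c ** 2) ** 0.2)


def equal_probability_limits(k, null_dist):
    """Step 3: interior class limits giving each class probability 1/k."""
    return [null_dist.inv_cdf(i / k) for i in range(1, k)]


def class_frequencies(sample, limits):
    """Step 4: observed counts; a value equal to a limit joins the lower class."""
    counts = [0] * (len(limits) + 1)
    for x in sample:
        counts[bisect.bisect_left(limits, x)] += 1
    return counts


def chi_square_equal_classes(counts):
    """Step 5, formula (2): (k / N) * sum(f_i^2) - N."""
    n, k = sum(counts), len(counts)
    return k / n * sum(f * f for f in counts) - n


# Sample calculation of the text: N = 1000 at the 5 per cent level.
k = mann_wald_k(1000, 0.05)
limits = equal_probability_limits(k, NormalDist(0.0, 1.0))
print(k, len(limits))
```

Because the class probabilities are all 1/k, no theoretical frequencies need to be computed class by class; only the observed counts enter formula (2).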


II. ADVANTAGES AND LIMITATIONS OF THE MANN-WALD TECHNIQUE 


At first glance this procedure seems much more complicated than 
the ordinary one of selecting equal class intervals along the horizontal 
axis and it is reasonable to ask what advantages are to be obtained by 
using this suggested technique. Before discussing these advantages, 
however, it is necessary to define two terms which will be used 
throughout the following discussion. 

1) The cumulative distribution function (cdf) is defined as the prob- 
ability that a random variable X be less than or equal to a given 
value, say “a”. For example, consider the familiar bell-shaped 
normal distribution. If we were to graph this as an ogive letting 
the vertical axis denote the percentage of total items having a 
value equal to or less than the value stated on the horizontal axis, 
we would have a cdf for the normal distribution. We could do the 
same for each probability distribution—binomial, Poisson, etc.— 
and in this discussion we will be referring to the cdf when we 
speak of a probability distribution. A typical cdf is shown in 
Diagram 3. 

2) Consider two such cdf’s. For each value of “a” along the 
horizontal scale there is a numerical difference between the two cdf’s 
along the vertical scale. The absolute value of the greatest such 
numerical difference will be defined as the distance between the 
two cdf’s. It is this distance that we will refer to when we speak of 
the power of the test. To clarify this point see Diagram 4. 

DIAGRAM 3. A typical cdf. 
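
The distance just defined can be approximated numerically by scanning a grid of values of “a”. The two normal distributions below are hypothetical examples, and the grid search is an assumption of this sketch; it gives a lower bound that approaches the true distance as the grid is refined.

```python
from statistics import NormalDist


def cdf_distance(cdf_a, cdf_b, grid):
    """Greatest absolute vertical difference between two cdf's on the grid."""
    return max(abs(cdf_a(x) - cdf_b(x)) for x in grid)


# Hypothetical pair: a null hypothesis cdf and a shifted alternative.
null_dist = NormalDist(0.0, 1.0)
alternative = NormalDist(0.2, 1.0)

grid = [i / 100.0 for i in range(-600, 601)]
distance = cdf_distance(null_dist.cdf, alternative.cdf, grid)
print(round(distance, 4))
```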

It should be noted at this time that the Mann-Wald theory is an 
asymptotic theory and that it has been rigorously proven only for 
sample sizes greater than or equal to 450 and the 5 per cent level of 
significance, and for sample sizes greater than or equal to 300 and the 
1 per cent level of significance. However the authors state their belief 
that the results hold approximately for sample sizes as low as 200 and 


may be true for considerably smaller samples. With this qualification 
and the aid of the definitions given, the advantages of the test may now 
be considered. 

1) By obtaining k from a formula or table and by choosing the class 
limits such that there are N/k theoretical frequencies in each 
class, the subjective element is removed from the choice of the 
number and width of the classes. 


2) The maximum distance of those cdf’s which have the same class 
frequencies as the null hypothesis cdf is minimized. In other 
words, the Mann-Wald technique does not eliminate the first 
limitation to the chi-square test but it does minimize the 
maximum distance of such alternative distributions. 

This follows from the fact that the maximum distance between 
such cdf’s and the null hypothesis cdf is equal to the maximum 
class probability, since by definition the ogive or cdf must rise or 
remain at the same level as we move from left to right along the 
horizontal axis and two cdf’s having the same class frequencies 
must intersect above the upper limit of each class. We minimize 
this maximum class probability by setting all the class 
probabilities equal, for if we choose the class probabilities in any 
other way there will be at least one class probability greater than 1/k. 

3) The power of the test for those cdf’s whose distance from the null 
hypothesis distribution is greater than or equal to $\Delta$ as given in 
Tables I and II is for practical purposes greater than or equal to 
one-half. In other words, the probability of rejecting the null 
hypothesis when the universe actually has a cdf a distance $\Delta$ or 
greater away from the null hypothesis cdf is equal to one-half or 
more. We can make no such statement for the ordinary test. 

This statement and the two which follow require a mathematical 
proof which can be found in the article by Mann and Wald. 

4) If a number of classes different from the one given by the formula 
is used to conduct the test, there will be at least one cdf whose 
distance from the null hypothesis cdf is greater than or equal to $\Delta$ 
such that the power of the test for that cdf is less than one-half. 

5) The choice of equal class probabilities gives us an unbiased test 
in the sense that when the class frequencies in the universe are not 
equal, the probability of accepting the null hypothesis is less than 
when they are equal. In other words, when the null hypothesis is 
true, we have a larger chance of accepting it than when it is not 
true. 

6) There is no need to worry about having more than five theoretical 
frequencies in each class for this condition is automatically ful- 
filled when the test is applicable. 
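
The last advantage can be checked numerically over the range of sample sizes covered by the tables. The sketch below assumes the formula for k given in Section I, with the normal deviate c taken from Python's `statistics.NormalDist` rather than from a printed table; the function name and the grid of sample sizes are illustrative.

```python
import math
from statistics import NormalDist


def number_of_classes(n, alpha):
    """Greatest integer in 4 * (2 (n-1)^2 / c^2)^(1/5)."""
    c = NormalDist().inv_cdf(1.0 - alpha)
    return math.floor(4.0 * (2.0 * (n - 1) ** 2 / c ** 2) ** 0.2)


# Smallest theoretical class frequency N/k for N = 200, 250, ..., 2000.
smallest_expected = {}
for alpha in (0.05, 0.01):
    smallest_expected[alpha] = min(
        n / number_of_classes(n, alpha) for n in range(200, 2001, 50)
    )

for alpha, value in smallest_expected.items():
    print(alpha, round(value, 2))
```

In both cases the smallest value of N/k occurs at the smallest sample size, since k grows more slowly than N.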

Like all other tests, this one also has its limitations and these must 
now be considered. 

1) As has already been explained, the theory is an asymptotic one. 
It has been proven rigorously only for large samples. 

2) The procedure is more complicated than the ordinary one and the 
choice of class limits which make the class probabilities equal is 
time consuming, especially since the number of classes given by 
the formula is quite high. However some time is saved since it is 
not necessary to compute theoretical frequencies as can be seen by 
noting formula (2). 

3) The class intervals used for the test are not suitable for visual 
presentation and another grouping must be made for that 
purpose. 

4) In order to conduct the test, ungrouped data are required since it 
is necessary to compute the actual class frequencies using class 
limits other than those given by an ordinary frequency 
distribution. Furthermore, in most cases the unclassified data must have 
as many or more significant figures than the class limits in order 
to decide into which class an observation falls. For a given sample 
size, as the range of the data increases, this becomes less 
important. 

5) The power of the test for those distributions whose distance from 
the null hypothesis is less than $\Delta$ is not known. 

Even more serious in this regard is the question as to whether 
this “distance” is a useful criterion. It may be more valuable to 
talk about the power of the test for those cdf’s which are similar 
in some other respect. For example, area between the alternative 
cdf’s and the null hypothesis cdf may be more important than 
distance. 

6) The most serious limitation is the fact that the parameters are 
assumed to be known and the distribution under the null 
hypothesis must be continuous. 


III. EFFECT OF A REDUCTION IN THE NUMBER OF CLASSES 


All that has been said heretofore applies to the case where we use the 
number of classes specified by the formula. Two interesting problems 
to consider are 1) the effect on the distance when the power is required 
to be one-half or greater but a smaller number of classes is desired and 
2) the effect on the power when the distance is held constant and a 
smaller number of classes is used. 

Such a study has been made with the results listed below. These 
results apply to the “worst alternative distribution,” that is, the 
alternative distribution with respect to which the power of the test is a 
minimum. 

1) For a given distance, the power of the test is reduced a relatively 
small amount by cutting k in half, the reduction of the power 
becoming smaller as N increases. For example, when N = 1000 and 
$\Delta$ = .083, the power drops from .50 to .40 when the number of 
classes is reduced from 59 to 30. 

2) When the power of the worst alternative distribution is equal to 
one-half, the distance increases slightly when k is cut in half, the 
increase becoming smaller as N increases. For example, when 
N = 1000, the distance increases from .083 to .089 when the 
number of classes is cut from 59 to 30. 

3) The power function of the test with respect to this worst 
alternative distribution is very steep, the power decreasing quite rapidly 
as the distance decreases. 

These results indicate that for practical purposes, the number of 
classes suggested by the formula may be cut in half with a relatively 
small effect on the power or the distance. 


IV. SUMMARY 


If the chi-square test of goodness of fit is to be used with ungrouped 
data, a large number of observations, a continuous distribution and 
known parameters, the Mann-Wald method may be used. This 
procedure is the usual one with the exception that the number of classes is 
given by the formula 

$$k = 4 \sqrt[5]{\frac{2(N-1)^2}{c^2}}$$

and the class limits are chosen such that the theoretical class 
frequencies equal N/k. This technique enables one to state that the power 
of the test for a family of cumulative distribution functions a distance 
$\Delta$ or greater away from the null hypothesis cdf is for practical purposes 
greater than or equal to one-half. If it is not required that the power of 
the test be one-half or more, the statistician may cut the number of 
classes in half for large sample sizes without greatly affecting the power. 
If the power of the test is required to be one-half but for a class with a 
distance greater than $\Delta$ away from the null hypothesis distribution, the 
number of classes may be halved without greatly increasing the 
distance. 





SIMPLIFIED PROCEDURES FOR FITTING A 
GOMPERTZ CURVE AND A MODIFIED 
EXPONENTIAL CURVE 


JACK SHERMAN 
AND 


WINIFRED J. MORRISON 
The Texas Company Research Laboratories, Beacon, New York 


This paper describes simplified methods for fitting a Gompertz 
curve and a modified exponential curve. These methods, 
together with the one described by Spurr and Arnold¹ for 
fitting a logistic curve, are useful in determining which type 
of growth curve is most appropriate for a given set of data. 


INTRODUCTION 


SPURR and Arnold¹ have recently published a short-cut method of 
fitting a logistic (Pearl-Reed) curve by means of a nomograph 
and a grid. The nomograph determines the upper limit from three 
selected points, and the grid reduces the curve to a straight line when 
the ordinates are plotted as percentages of the upper limit. 

As Spurr and Arnold point out, the logistic equation is merely one 
of several empirical approaches to the “law of growth.” It would be 
especially convenient to have simplified methods worked out for fitting 
other growth curves, so that a given set of data could be plotted on 
various grids, and the most appropriate curve selected. 

The present paper describes simplified methods for fitting the 
Gompertz curve and the modified exponential curve. 


GOMPERTZ CURVE 
Writing the Gompertz equation in the form 

$$Y = k\, a^{b^X}, \qquad 0 < a < 1, \quad 0 < b < 1, \tag{1}$$

it is seen that k signifies the upper limit asymptotically approached by 
Y as $X \to \infty$. Equation (1) may be rewritten in the form² 

$$\ln \ln \frac{1}{p} = \ln \ln \frac{1}{a} + (\ln b)X, \tag{2}$$

in which 

$$p = Y/k \tag{3}$$

denotes Y values expressed as fractions of the upper limit k. It is 
evident that if a grid is constructed whose ordinates are a linear function 
of $\ln \ln 1/p$ and whose abscissas are proportional to X, a set of X and 
Y values related by Gompertz Equation (1) will plot linearly on the 
grid if the Y values are expressed as percentages of the upper limit k. 
In order to use such a grid, it is therefore necessary first to evaluate k. 
A nomograph for doing this is described in the following section. 

¹ William A. Spurr and David R. Arnold, Journal of the American Statistical Association, 43, 127-
134 (1948). 

² ln denotes natural logarithm. An equation similar in form to Eq. (2) holds for ordinary 
logarithms. 
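
The linearity claimed for Equation (2) can be checked numerically. The parameter values below are hypothetical; the sketch computes $\ln \ln (1/p)$ at equally spaced values of X and confirms that successive differences all equal $\ln b$.

```python
import math


def gompertz(x, k, a, b):
    """Equation (1): Y = k * a ** (b ** X)."""
    return k * a ** (b ** x)


# Hypothetical parameters satisfying 0 < a < 1 and 0 < b < 1.
k, a, b = 100.0, 0.05, 0.7

# Grid ordinate of Equation (2): ln ln (1/p) with p = Y / k.
ordinates = [
    math.log(math.log(1.0 / (gompertz(x, k, a, b) / k))) for x in range(6)
]

# On a straight line, every step between neighbouring points is the same.
steps = [ordinates[i + 1] - ordinates[i] for i in range(5)]
print(steps, math.log(b))
```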


NOMOGRAPH FOR FINDING UPPER LIMIT OF GOMPERTZ CURVE 


For three ordinates, say $Y_0$, $Y_1$, and $Y_2$, corresponding to equally 
spaced values of X, it may be readily verified from Equation (1) that 

$$\ln k = \frac{\ln Y_0 \ln Y_2 - \ln^2 Y_1}{\ln Y_0 + \ln Y_2 - \ln (Y_1^2)}. \tag{4}$$

A nomograph based on Equation (4) is shown in Figure 1. It was 
constructed by reducing Equation (4) to the form 

$$D^2 - 2DF + EF = 0, \tag{5}$$

in which 

$$D = \ln (Y_1/Y_0), \tag{6}$$
$$E = \ln (Y_2/Y_0), \tag{7}$$

and 

$$F = \ln (k/Y_0). \tag{8}$$

Equation (5) can be expressed as the third order determinant 

$$\begin{vmatrix} D & \dfrac{1}{2D} & 1 \\ 0 & \dfrac{1}{E} & 1 \\ 2F & 0 & 1 \end{vmatrix} = 0. \tag{9}$$

Multiplying this determinant by a matrix of transformation to obtain 
a suitable nomographic determinant,³ the equations for the coordinates 
of the scales (in arbitrary linear units) turn out to be 

$$X_F = 0, \qquad Y_F = 4F - 6.5, \tag{10}$$
$$X_E = \frac{3}{E - 3.8}, \qquad Y_E = \frac{4 - 6.5E}{E - 3.8}, \tag{11}$$
$$X_D = \frac{1.5}{D - 1.9}, \qquad Y_D = \frac{2D^2 - 6.5D + 2}{D - 1.9}. \tag{12}$$

³ Cf. F. T. Mavis, “The Construction of Nomographic Charts,” Scranton, Pa., International 
Textbook Co., 1939. 

FIGURE 2 
LINEAR TRANSFORMATION OF GOMPERTZ CURVE 

The nomograph shown in Figure 1 is used to find the upper limit k 
of a Gompertz curve from $Y_0$, $Y_1$, and $Y_2$ (corresponding to equally 
spaced values of X) by first forming the ratios $Y_1/Y_0$ and $Y_2/Y_0$, then 
placing a straight-edge across the left and center scales corresponding 
to the values of these ratios, reading the ratio $k/Y_0$ from the 
intersection of the straight-edge with the right-hand scale, and finally 
multiplying $(k/Y_0)$ by $Y_0$. 
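
Where no nomograph is at hand, the same upper limit can be computed directly from the three ordinates. The sketch below applies the relation among the logarithms of three equally spaced ordinates; the function name and the test values are hypothetical, and the three points are generated from a known Gompertz curve so that the result can be compared with the true k.

```python
import math


def gompertz_upper_limit(y0, y1, y2):
    """Upper limit k from three ordinates at equally spaced values of X."""
    l0, l1, l2 = math.log(y0), math.log(y1), math.log(y2)
    return math.exp((l0 * l2 - l1 * l1) / (l0 + l2 - 2.0 * l1))


# Hypothetical curve: k = 100, a = 0.05, b = 0.7, sampled at X = 0, 1, 2.
true_k, a, b = 100.0, 0.05, 0.7
y0, y1, y2 = (true_k * a ** (b ** x) for x in (0, 1, 2))

estimated_k = gompertz_upper_limit(y0, y1, y2)
print(estimated_k)
```

With observed rather than exact ordinates the result depends on which three points are selected, so it is best treated as a first approximation.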


GRID FOR LINEAR REDUCTION OF GOMPERTZ CURVE 


The grid shown in Figure 2 was constructed on the basis of the 
equation 

$$\ln \ln \frac{1}{0.002} - \ln \ln \frac{1}{p} = \left( \ln \ln \frac{1}{0.002} - \ln \ln \frac{1}{a} \right) - (\ln b)X. \tag{13}$$

Equation (13) follows directly from Equation (2). 

The ordinates of the grid are proportional to the values of 
$(\ln \ln 1/0.002 - \ln \ln 1/p)$ and the abscissas are equally spaced. 


EVALUATION OF PARAMETERS IN GOMPERTZ EQUATION 


For a given line on the grid, the remaining constants of the Gompertz 
equation, namely a and b, may be easily found. From Equation (13), 
it is seen that for two values of p, and the corresponding X-values, 

$$-\ln b = \frac{\left( \ln \ln \dfrac{1}{0.002} - \ln \ln \dfrac{1}{p_2} \right) - \left( \ln \ln \dfrac{1}{0.002} - \ln \ln \dfrac{1}{p_1} \right)}{X_2 - X_1}. \tag{14}$$

Equation (14) is convenient for determining b from two points on the 
line. The linear scale on the left-hand side of the grid gives the values 
of $(\ln \ln 1/.002 - \ln \ln 1/p)$. To obtain $-\ln b$, it is only necessary to 
compute the ratio of the difference between the two values of 
$(\ln \ln 1/.002 - \ln \ln 1/p)$ to the difference between the corresponding 
X-values. 

When the value of b has been determined, a may be found from 
Equation (13) and the grid. 

Rewriting Equation (13) 

$$\left( \ln \ln \frac{1}{.002} - \ln \ln \frac{1}{a} \right) = \left( \ln \ln \frac{1}{.002} - \ln \ln \frac{1}{p} \right) + (\ln b)X. \tag{15}$$

Equation (15) may be used to find $(\ln \ln 1/.002 - \ln \ln 1/a)$ from one 
value of p and its associated X. 

If the value of $(\ln \ln 1/.002 - \ln \ln 1/a)$ is within the range of the 
linear scale on the left hand side of the grid, the corresponding value 
of 100a is immediately obtained from the ordinate of the grid, as 
shown on the right-hand scale. However, if $(\ln \ln 1/.002 - \ln \ln 1/a)$ 
is outside the range of the linear scale, a table of natural logarithms 
(or exponentials) must be used to obtain a. 
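
The evaluation of a and b from two points on the fitted line can be sketched numerically. The function name and the test values are hypothetical. The constant $\ln \ln (1/0.002)$ of the grid scale cancels when differences are taken, so the sketch works with $\ln \ln (1/p)$ directly.

```python
import math


def gompertz_a_b(p1, x1, p2, x2):
    """Recover a and b from two points (X, p) lying on the fitted line."""
    g1 = math.log(math.log(1.0 / p1))
    g2 = math.log(math.log(1.0 / p2))
    ln_b = (g2 - g1) / (x2 - x1)          # slope of ln ln (1/p) against X
    ln_ln_inv_a = g1 - ln_b * x1          # intercept, equal to ln ln (1/a)
    a = math.exp(-math.exp(ln_ln_inv_a))
    return a, math.exp(ln_b)


# Hypothetical line generated from a = 0.05 and b = 0.7.
true_a, true_b = 0.05, 0.7
p1 = true_a ** (true_b ** 1)
p2 = true_a ** (true_b ** 4)

a, b = gompertz_a_b(p1, 1, p2, 4)
print(a, b)
```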


MODIFIED EXPONENTIAL CURVE 
The modified exponential growth curve may be written in the form 

$$Y = k + AB^X, \qquad A < 0, \quad 0 < B < 1. \tag{16}$$

In this equation, k signifies the asymptotic upper limit of Y as $X \to \infty$. 
Equation (16) may be rewritten in the form 

$$\ln \frac{1}{1-p} = \ln \left( -\frac{1}{A} \right) - (\ln B)X, \tag{17}$$

in which p denotes Y/k and A is expressed as a fraction of k. 

Equation (17) may be used as a basis for constructing a grid which 
will result in a linear plot of Y/k vs. X. However, just as for the case 
of the Gompertz curve, it is first necessary to evaluate k. This can be 
conveniently done from three selected points by means of the 
nomograph described in the following section. 


NOMOGRAPH FOR FINDING UPPER LIMIT 


For three ordinates corresponding to equally spaced values of X, 
it follows from Equation (16) that 

$$k = \frac{Y_1^2 - Y_0 Y_2}{2Y_1 - Y_0 - Y_2}. \tag{18}$$

Figure 3 shows a nomograph based on Equation (18). It was 
constructed by converting Equation (18) to the form 

$$A^2 - C - 2A\theta + \theta + C\theta = 0, \tag{19}$$

in which 

$$A = Y_1/Y_0, \tag{20}$$
$$C = Y_2/Y_0, \tag{21}$$

and 

$$\theta = k/Y_0. \tag{22}$$

Equation (19) can be expressed in determinantal form as 

$$\begin{vmatrix} \dfrac{(A-1)^2}{2A-1} & \dfrac{1}{2A-1} & 1 \\ 0 & \dfrac{1}{C} & 1 \\ \theta - 1 & 0 & 1 \end{vmatrix} = 0. \tag{23}$$

Multiplying this determinant by a matrix of transformation to 
obtain a suitable nomographic determinant,⁴ the equations for the 
coordinates of the scales (in arbitrary linear units) are 

$$X_\theta = 0, \qquad Y_\theta = 0.25\theta - 2.5, \tag{24}$$
$$X_C = \frac{6}{C - 8}, \qquad Y_C = \frac{7.5 - 2.25C}{C - 8}, \tag{25}$$
$$X_A = \frac{6}{2A - 9}, \qquad Y_A = \frac{A^2 - 20A + 40}{8A - 36}. \tag{26}$$

The nomograph shown in Figure 3 is used to find the upper limit k 
of the modified exponential curve in precisely the same manner as 
previously described for the Gompertz curve. A straight-edge is placed 
across the left-hand and middle scales corresponding to values of the 
ratios $Y_1/Y_0$ and $Y_2/Y_0$ respectively. The corresponding value of 
$k/Y_0$ is read from the intersection of the right-hand scale with the 
straight-edge. 
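
The upper limit of the modified exponential curve can also be computed directly from three ordinates at equally spaced values of X, without the nomograph. The function name and the test curve below are hypothetical; the points are generated from a known curve so that the result can be compared with the true k.

```python
def modified_exponential_upper_limit(y0, y1, y2):
    """Upper limit k = (Y1^2 - Y0 Y2) / (2 Y1 - Y0 - Y2)."""
    return (y1 * y1 - y0 * y2) / (2.0 * y1 - y0 - y2)


# Hypothetical curve Y = 50 - 40 * 0.6 ** X, sampled at X = 0, 1, 2.
true_k, coefficient, base = 50.0, -40.0, 0.6
y0, y1, y2 = (true_k + coefficient * base ** x for x in (0, 1, 2))

estimated_k = modified_exponential_upper_limit(y0, y1, y2)
print(estimated_k)
```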


GRID FOR LINEAR REDUCTION OF MODIFIED EXPONENTIAL CURVE 


The grids shown in Figures 4 and 5 were constructed from the 
equation 

$$\ln \frac{1}{1-p} - \ln \frac{1}{0.998} = \left[ \ln \left( -\frac{1}{A} \right) - \ln \frac{1}{0.998} \right] - (\ln B)X. \tag{27}$$

Figure 4 covers values of p from 0.2 per cent to 99.8 per cent, and is 
useful for plotting data in the upper region of the growth curve. Figure 
5 is an enlargement of the lower part of Figure 4, that is from 0.2 to 
80 per cent. It is more convenient for plotting data in the lower and 
middle regions of the growth curve. 

⁴ Cf. F. T. Mavis, loc. cit. 

FIGURE 4 
LINEAR TRANSFORMATION OF MODIFIED EXPONENTIAL CURVE, $Y = k + AB^X$ 

FIGURE 5 
LINEAR TRANSFORMATION OF MODIFIED EXPONENTIAL CURVE, $Y = k + AB^X$ 


EVALUATION OF PARAMETERS OF MODIFIED EXPONENTIAL EQUATION 


For a given line on the grid, constants A and B of the modified 
exponential equation may be found in a manner analogous to that of 
the Gompertz equation. From Equation (27), it follows that 

$$-\ln B = \frac{\left( \ln \dfrac{1}{1-p_2} - \ln \dfrac{1}{.998} \right) - \left( \ln \dfrac{1}{1-p_1} - \ln \dfrac{1}{.998} \right)}{X_2 - X_1}. \tag{28}$$

The linear scale shown on the left-hand side of the grids contained in 
Figures 4 and 5 gives values of $[\ln 1/(1-p) - \ln 1/.998]$. Hence 
$-\ln B$ may be calculated from two points on a line by taking the ratio 
of the difference between their $[\ln 1/(1-p) - \ln 1/.998]$ values to the 
difference between their corresponding X-values. 

It follows from Equation (27) that 

$$\left[ \ln \left( -\frac{1}{A} \right) - \ln \frac{1}{.998} \right] = \left( \ln \frac{1}{1-p} - \ln \frac{1}{.998} \right) + (\ln B)X. \tag{29}$$





If the numerical value of $[\ln (-1/A) - \ln 1/.998]$ is on the left-hand 
linear scales of Figures 4 and 5, the corresponding value of 
(100 + 100A) is given by the ordinate shown on the right-hand scale, A 
being negative. If the value of $[\ln (-1/A) - \ln 1/.998]$ is outside the 
range of the linear scale, A may be obtained with the aid of tables of 
natural logarithms. 
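
The evaluation of A and B from two points on the fitted line can be sketched numerically. Two assumptions are made here and are not taken from the text: A is expressed as a fraction of the upper limit k, so that p = 1 + A B^X, and the scale constant $\ln (1/.998)$ is omitted because it cancels. The function name and the test values are hypothetical.

```python
import math


def modified_exponential_a_b(p1, x1, p2, x2):
    """Recover A (as a fraction of k) and B from two points (X, p)."""
    h1 = math.log(1.0 / (1.0 - p1))
    h2 = math.log(1.0 / (1.0 - p2))
    ln_b = -(h2 - h1) / (x2 - x1)         # the slope of the line is -ln B
    ln_neg_inv_a = h1 + ln_b * x1         # intercept, equal to ln(-1/A)
    return -math.exp(-ln_neg_inv_a), math.exp(ln_b)


# Hypothetical line generated from A = -0.8 (in units of k) and B = 0.6.
true_a, true_b = -0.8, 0.6
p1 = 1.0 + true_a * true_b ** 1
p2 = 1.0 + true_a * true_b ** 3

a, b = modified_exponential_a_b(p1, 1, p2, 3)
print(a, b)
```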


CONCLUDING REMARKS 


Although the methods described in the foregoing paragraphs are 
convenient for determining values of the parameters of the Gompertz 
and modified exponential equations, it should be emphasized that the 
parameters so obtained are not necessarily the “best” in the least 
squares sense. This is a result of using three “selected” points to 
determine k. Obviously, a different selection of the three points would 
yield a somewhat different value of k, and hence of the other two param- 
eters. 

The chief advantage of the grids and nomographs is to show which 
type of growth curve is most appropriate for a given set of data. 
However, the parameters evaluated for the most appropriate equation 
by the methods of this article or by that described by Spurr and 
Arnold⁵ can be used as initial approximations to those obtained by 
the methods of least squares.⁶ 


⁵ Loc. cit. 
⁶ W. E. Deming, Statistical Adjustment of Data, John Wiley and Sons, Inc., New York, 1943. 








BIAS DUE TO NON-AVAILABILITY IN 
SAMPLING SURVEYS* 


Z. W. BIRNBAUM 
AND 
MONROE G. SIRKEN 


A technique is presented for the treatment of errors intro- 
duced into sampling surveys due to the non-availability of 
respondents. The expected cost and variance of the sample 
survey are expressed as functions of sample size and of the 
number of call-backs made on the non-availables. A method is 
then presented which optimizes precision for a given cost by 
playing sampling error against the bias resulting from non- 
availables. 


I. INTRODUCTION 


SAMPLING ERROR is merely one of the possible errors in sampling 
surveys. In addition there are other types of errors resulting from 
numerous sources, such as inadequately trained interviewers, 
ambiguous definitions, non-availability of respondents, etc. These latter 
errors are introduced in implementing any survey, with or without 
sampling. In this paper non-availability of respondents is singled out for 
special consideration, and a technique for treatment of the 
“non-availables” in sampling surveys is presented. 

During the past decade a mass of data has been accumulated which 
clearly shows the possibility of bias introduced when inferring the 
distribution of an attribute or attitude in a population from only that part 
of a randomly selected sample which was more easily available for 
questioning [1], [4], [6], [7], [9], [10], [12]. Indeed, the magnitude of the 
bias resulting from incomplete interviewing of the randomly selected 
sample in many instances dwarfs the sampling error by comparison. 
The relative magnitude of the sampling error to the bias suggested to 
some researchers the efficacy of reducing the size of the sample and 
diverting the saved money and effort to a more complete coverage of the 
sample [9], [12]. They offer some evidence that reduction in sample 
size and increased concentration on the non-availables in sample 
surveys may, under certain conditions, result not only in greater precision 
but also in reduced cost. As far as the present authors know, the 
mathematics relating cost to precision in terms of both sampling error 
and bias due to non-availables has not been presented to date. One 
possible approach to the problem has been indicated by Hansen and 
Hurwitz [5] who applied Neyman’s double sampling theory [8] to a 
subsample of the non-availables, but they were not directly concerned 
with the non-availability bias. 

* Research under the sponsorship of the Office of Naval Research. Presented to the Institute of 
Mathematical Statistics, November 27, 1948. 

In the present paper, the expected cost and variance of a sample sur- 
vey are expressed as functions of sample size and the number of call- 
backs made on the non-availables. Although reference is made to call- 
backs on non-availables by personal interview, generalization of 
call-backs to coverage of the sample by other means such as telephone 
or mailed questionnaire is obvious. A technique is then presented which 
optimizes precision for given cost by playing sampling error against the 
bias resulting from non-availables. 


II. BIAS DUE TO “NON-AVAILABLES” 


For the purpose of this presentation, we consider a population $\pi$ and 
a definite procedure $\mathcal{S}$ for attempting to make individuals from $\pi$ 
available for interviewing; we assume that an individual, once he is 
available, will respond; that the interview consists of one question; and 
that the response to this question can only be “yes” or “no.” 

Assuming that procedure $\mathcal{S}$ is applied to all individuals in $\pi$, we 
introduce the notations: 


$N_{11}$ = Number of individuals in $\pi$ who are available and respond
“yes,”

$N_{10}$ = Number of individuals in $\pi$ who are available and respond
“no,”

$N_{01}$ = Number of individuals in $\pi$ who are not available and
respond “yes,”

$N_{00}$ = Number of individuals in $\pi$ who are not available and
respond “no,”

$N_{1.} = N_{11} + N_{10}$ = Number of individuals in $\pi$ who are available,

$N_{0.} = N_{01} + N_{00}$ = Number of individuals in $\pi$ who are not available,

$N = N_{1.} + N_{0.}$ = Number of all individuals in $\pi$,

$p_{1.} = N_{1.}/N$ = Fraction of those available in $\pi$,

$p_{0.} = N_{0.}/N$ = Fraction of those not available in $\pi$,

$p_{.1} = (N_{11} + N_{01})/N$ = Fraction of those responding “yes” in $\pi$,

$p_{.0} = (N_{10} + N_{00})/N$ = Fraction of those responding “no” in $\pi$,

$p_{11} = N_{11}/N$ = Fraction of those available and responding “yes” in $\pi$,

$p_{10} = N_{10}/N$ = Fraction of those available and responding “no” in $\pi$,

$p_{01} = N_{01}/N$ = Fraction of those not available and responding “yes”
in $\pi$,

$p_{00} = N_{00}/N$ = Fraction of those not available and responding “no”
in $\pi$,

$p' = N_{11}/N_{1.} = p_{11}/p_{1.}$ = Fraction of those responding “yes” among those
available in $\pi$,

$p'' = N_{01}/N_{0.} = p_{01}/p_{0.}$ = Fraction of those responding “yes” among those
not available in $\pi$.


The parameter desired in a survey is $p_{.1}$, i.e. the proportion of
individuals in population $\pi$ responding “yes.” However, for various reasons
some individuals in population $\pi$ cannot be made available for
questioning by the procedure $\mathfrak{S}$, and the best one can do is determine the
proportion $p'$ of individuals having opinion “yes” among those available.
The difference


$b = p' - p_{.1}$


will be referred to as the “bias due to non-availability.” It is of interest
to see how large this bias can be. We have


$b = p' - p_{.1} = p' - p_{11} - p_{01} = p' - p'p_{1.} - p''p_{0.}$
$\quad = p'(1 - p_{1.}) - p''p_{0.} = p'p_{0.} - p''p_{0.} = p_{0.}(p' - p'').$


This formula expresses the bias in terms of the difference between the 
proportion responding “yes” among the availables, and the proportion 
that would respond “yes” among the non-availables. If no information
is given for $p''$, then

(2.0)  $0 \le p'' \le 1$

and we obtain the lower and upper bounds for $b$

(2.1)  $m(p_{0.}, p') = -p_{0.}(1 - p') \le b \le p_{0.}p' = M(p_{0.}, p').$


It can be seen from (2.1) that the range for the bias is equal to
$M - m = p_{0.}$, and is independent of $p'$. However, for given $p_{0.}$, the lower
bound in (2.1) is absolutely largest for $p' = 0$, and the upper bound for
$p' = 1$, so that


$\max |m(p_{0.}, p')| = |m(p_{0.}, 0)| = p_{0.},$
$\max M(p_{0.}, p') = M(p_{0.}, 1) = p_{0.},$


$\min_{p'} \max(|m|, M) = |m(p_{0.}, \tfrac{1}{2})| = M(p_{0.}, \tfrac{1}{2}) = \dfrac{p_{0.}}{2}.$


Hence, if it is desired to assure $|b|$ small, it is most advantageous to
use a question for which $p'$ is close to $\tfrac{1}{2}$.

Clearly (2.1) can be improved if there is some information available 
which makes it possible to narrow down the range for p’’ given in (2.0). 
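
The bias formula and the bounds (2.1) can be illustrated with a short Python sketch. The function names and the numerical inputs below are illustrative assumptions, not values from the paper; the code only restates $b = p_{0.}(p' - p'')$ and the bounds that hold when nothing is known about $p''$.

```python
def bias(p0, p_avail, p_nonavail):
    """Bias b = p0 * (p' - p'') of the proportion observed among availables."""
    return p0 * (p_avail - p_nonavail)


def bias_bounds(p0, p_avail):
    """Bounds (m, M) of (2.1), valid when only 0 <= p'' <= 1 is known."""
    lower = -p0 * (1.0 - p_avail)  # attained when p'' = 1
    upper = p0 * p_avail           # attained when p'' = 0
    return lower, upper


# Hypothetical inputs: 20% non-available, half of the availables say "yes".
p0, p_avail = 0.20, 0.50
m, M = bias_bounds(p0, p_avail)
print(m, M, M - m)  # the width M - m equals p0 whatever p' is
```

With $p' = 1/2$ the two bounds are equal in absolute value, which is the situation the text recommends when $|b|$ is to be kept small.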


III. A BIASED STATISTIC, ITS EXPECTED VALUE AND VARIANCE 


Let $O_n$ be a random sample of size $n$ from the population $\pi$, and let
the procedure $\mathfrak{S}$ be applied to all individuals of $O_n$. We introduce the
notations:


$n$ = Number of individuals in $O_n$,

$n_{1.}$ = Number of individuals in $O_n$ who are available,

$n_{0.}$ = Number of individuals in $O_n$ who are not available,

$n_{11}$ = Number of individuals in $O_n$ who are available and respond
“yes,”

$n_{10}$ = Number of individuals in $O_n$ who are available and respond
“no,”

$n_{01}$ = Number of individuals in $O_n$ who are not available and
respond “yes,”

$n_{00}$ = Number of individuals in $O_n$ who are not available and
respond “no.”


We have


$n = n_{1.} + n_{0.}, \quad n_{1.} = n_{10} + n_{11}, \quad n_{0.} = n_{01} + n_{00}.$
Of these frequencies, only $n$, $n_{1.}$, $n_{0.}$, $n_{11}$, and $n_{10}$ can be actually
obtained. Since $n_{01}$ remains unknown, there appears to be no way of
estimating the parameter


$p_{.1} = \dfrac{N_{11} + N_{01}}{N}$


unless some additional information is available. The statistic usually
computed is


(3.0)  $U = \dfrac{n_{11}}{n_{1.}}.$


Since we have¹


(3.1)  $E(U) = E[E(U \mid n_{1.})] = E\left[\dfrac{1}{n_{1.}} E(n_{11} \mid n_{1.})\right] = E\left[\dfrac{1}{n_{1.}}\, n_{1.} p'\right] = p',$


the statistic $U$ is an unbiased estimate of the parameter $p'$. In Part II
we have discussed the bias introduced by using $p'$ in place of $p_{.1}$.
The population-size $N$ will, from now on, be assumed so large in
comparison with the sample size $n$, that the random selections of the
different individuals of $O_n$ may be considered independent.
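For the variance of $U$ we have


$\sigma^2(U) = E(U^2) - E^2(U),$


where


$E(U^2) = E\left[\left(\dfrac{n_{11}}{n_{1.}}\right)^2\right] = E\left[E\left(\left(\dfrac{n_{11}}{n_{1.}}\right)^2 \,\Big|\, n_{1.}\right)\right] = E\left[\dfrac{1}{n_{1.}^2}\, E(n_{11}^2 \mid n_{1.})\right]$


and

$E[n_{11}^2 \mid n_{1.}] = \sigma^2(n_{11} \mid n_{1.}) + E^2(n_{11} \mid n_{1.}) = n_{1.}p'(1 - p') + (n_{1.}p')^2,$

so that


$E(U^2) = p'^2 + p'(1 - p')\, E\left(\dfrac{1}{n_{1.}}\right),$
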

1 The notation used is: E(X| Y) =conditional expectation of the random variable X for given Y. 
and finally


(3.2)  $\sigma^2(U) = p'(1 - p')\, E\left(\dfrac{1}{n_{1.}}\right).$

Since a survey will certainly be disregarded if not a single individual
has been made available for interviewing, we may consider $n_{1.}$ as a
positive Bernoulli variable with zero-value excluded. Under this
assumption $E(1/n_{1.})$ can be evaluated by use of Stephan’s [11] formulae
(16), (18), (19), (25) and (26.2). If $n$ is so large and $p_{0.} = 1 - p_{1.}$ so small
that


$k = \dfrac{n\, p_{1.}\, p_{0.}^{\,n}}{1 - p_{0.}^{\,n}}$


is negligible, formula (16) in [11] may be used with t=3 and it is obvious 
that ye, us and E(Rs) are negligible so that a very close approximate 
value for $E(1/n_{1.})$ is given by


(3.3)  $E\left(\dfrac{1}{n_{1.}}\right) = \dfrac{1}{(n + 1)p_{1.}}.$²


Hence, if $k$ is sufficiently small, we have with a very good
approximation


(3.4)  $\sigma^2(U) = \dfrac{p'(1 - p')}{(n + 1)p_{1.}}.$


This formula may be given an intuitive interpretation: $\sigma^2(U)$ is equal
to the variance of the relative frequency of the number of those
answering “yes” in a sample of size $(n+1)p_{1.}$ taken from the subpopulation of
those in $\pi$ who are available.
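
As a rough numerical check of approximation (3.3), the following Python sketch compares $1/((n+1)p_{1.})$ with the exact mean of $1/n_{1.}$ for a binomial count with the zero value excluded. The function names and the inputs $n = 100$, $p_{1.} = .65$ are illustrative assumptions; the exact sum is a direct enumeration and does not use Stephan's series.

```python
from math import comb


def exact_mean_reciprocal(n, p):
    """E(1/X) for X ~ Binomial(n, p) conditioned on X >= 1 (zero excluded)."""
    q = 1.0 - p
    total = sum(comb(n, x) * p**x * q**(n - x) / x for x in range(1, n + 1))
    return total / (1.0 - q**n)


def approx_mean_reciprocal(n, p):
    """Approximation (3.3): 1 / ((n + 1) * p)."""
    return 1.0 / ((n + 1) * p)


def approx_variance_u(n, p_avail_fraction, p_yes):
    """Approximation (3.4) for the variance of U."""
    return p_yes * (1.0 - p_yes) * approx_mean_reciprocal(n, p_avail_fraction)


# Hypothetical inputs: sample of 100, 65% of the population available.
n, p1 = 100, 0.65
exact = exact_mean_reciprocal(n, p1)
approx = approx_mean_reciprocal(n, p1)
print(exact, approx, abs(exact - approx) / exact)
```

For inputs of this size the two values differ by a few percent at most, so the comparison is a plausibility check on (3.3) rather than a proof of it.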

At this place it may be appropriate to comment on the frequent
practice of substituting for those individuals in $O_n$ who are not
available other individuals selected at random from the population $\pi$ who are
available. By this practice, one finally obtains a sample of size $n$, in
which all individuals are available. Such substitutions result only in
increasing the sample from the $N_{1.}$ available individuals, but the final
sample still does not include any of the $N_{0.}$ individuals in $\pi$ who are not
available. These substitutions, therefore, reduce the sampling error of
the estimate for $p'$, but do not help correct for the bias $p' - p_{.1}$.





2 It may be worth noting that this approximate value remains practically unchanged if $n_{1.}$ is taken
as a positive Bernoulli variable truncated at a much higher value than 0, as long as $np_{1.}$ is large. This
has been verified for $np_{1.} \ge 50$. The practical implication is that the variance $\sigma^2(U)$ is not appreciably
affected even if the survey director is willing to discard samples for which $n_{1.} \le L$ where $L$ is considerably
greater than zero.
IV. SAMPLE-SIZE FOR GIVEN PRECISION

The total error due to using $U$ as an estimate for $p_{.1}$ is


$H = U - p_{.1} = U - p' + p' - p_{.1} = S + b$


where $S = U - E(U)$ is the sampling error and $b$ the bias. We wish to
determine the sample size $n$ so that there will be a high probability, say
at least $1 - \alpha$ where $\alpha$ is a small positive number, of obtaining a total
error $H$ which will be absolutely less than a preassigned small positive
number $\delta$. In symbols this would mean an $n$ determined so that the
inequality


(4.1)  $P(|H| \le \delta) \ge 1 - \alpha$


is fulfilled. We will say that such an $n$ assures the given “precision” $\delta$
at the “probability level” $\alpha$.

In the Appendix it is shown that an integer $n$ satisfying the
inequality


(4.2)  $n \ge \dfrac{T_\alpha^2}{4\delta(1 - p_{0.})(\delta - p_{0.})} - 1$


will assure precision $\delta$ at probability level $\alpha$; $T_\alpha$ is obtained from the
tables of the normal probability integral so that


$\dfrac{2}{\sqrt{2\pi}} \int_0^{T_\alpha} e^{-t^2/2}\, dt = 1 - \alpha.$

The smallest integer value of $n$ satisfying (4.2) will be denoted by
$n(\delta, \alpha, p_{0.})$.

It should be pointed out that $n(\delta, \alpha, p_{0.})$ is not the smallest value of
$n$ which can be obtained for a given precision and probability level. A
method to obtain a smaller value is suggested at the end of the
Appendix; it requires, however, laborious numerical computations.
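
The sample-size rule (4.2) can be sketched in Python as follows. The function name `required_sample_size` is an assumption made for illustration, and the normal quantile is taken from the standard library rather than from printed tables, so the last digit of a result could in principle differ from a hand computation.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(delta, alpha, p0):
    """Smallest integer n satisfying (4.2) for precision delta, level alpha,
    and fraction p0 of non-availables. Requires delta > p0, cf. (A.3)."""
    if not delta > p0:
        raise ValueError("precision delta must exceed p0, the bound on |b|")
    # T_alpha solves (2/sqrt(2*pi)) * integral_0^T exp(-t^2/2) dt = 1 - alpha,
    # i.e. it is the upper alpha/2 point of the standard normal distribution.
    t_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    bound = t_alpha**2 / (4.0 * delta * (1.0 - p0) * (delta - p0)) - 1.0
    return max(1, ceil(bound))


# Illustrative call: precision .15, level .05, 11% non-available.
print(required_sample_size(0.15, 0.05, 0.11))
```

The guard on `delta > p0` reflects the requirement that the bias alone may be as large as $p_{0.}$, so no sample size can assure a precision at or below that value.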


V. SAMPLING WITH CALL-BACKS AND ITS COST 


We shall now describe a procedure $\mathfrak{S}$ which may be denoted by $\mathfrak{S}_k$
and which consists in pre-assigning a number $k$, the “number of
call-backs,” and in making consecutive calls on an individual up to $k$ times
in order to make him available for interviewing. If an individual has
been made available for an interview at the $j$th call, where $1 \le j \le k$, no
further calls are made and he is counted as “available”; if he is not
available for an interview in all $k$ calls, he is considered “not available.”
We shall use the following notations:

$p_{1.}^{(j)}$ = Probability of an individual being non-available in the 1st,
2nd, $\cdots$, $(j-1)$st call and being available on the $j$th call
$(j = 1, \cdots, k)$,

$p_{0.}^{(k)} = 1 - \sum_{j=1}^{k} p_{1.}^{(j)}$ = Probability of an individual being
non-available within $k$ calls,

$n_{1.}^{(j)}$ = Number of those in sample $O_n$ available exactly at the $j$th
call,

$c$ = Total cost of field operations of the sample survey,

$\beta_j$ = Total cost due to an individual available exactly at the $j$th
call including cost of the interview $(j = 1, \cdots, k)$,

$\gamma_k$ = Total cost due to an individual non-available up to and
including the $k$th call.


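The total cost of the field operations of the sample survey is then:


(5.0)  $c = \sum_{j=1}^{k} n_{1.}^{(j)} \beta_j + \gamma_k \left(n - \sum_{j=1}^{k} n_{1.}^{(j)}\right) = \sum_{j=1}^{k} (\beta_j - \gamma_k)\, n_{1.}^{(j)} + n\gamma_k$


and since $E(n_{1.}^{(j)}) = n p_{1.}^{(j)}$, $j = 1, 2, \cdots, k$, the expected cost of a
sample survey with sample size $n$ and $k$ call-backs is


(5.1)  $E_{(k,n)}(c) = n\left[\sum_{j=1}^{k} (\beta_j - \gamma_k)\, p_{1.}^{(j)} + \gamma_k\right].$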


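Consider the joint distribution of $(n_{1.}^{(1)}, \cdots, n_{1.}^{(j)}, \cdots, n_{1.}^{(k)})$. This
is a multinomial probability distribution with the probabilities
$p_{1.}^{(1)}, \cdots, p_{1.}^{(j)}, \cdots, p_{1.}^{(k)}$.
The variances and covariances are³


(5.2)  $\sigma^2(n_{1.}^{(j)}) = n p_{1.}^{(j)} (1 - p_{1.}^{(j)}),$
(5.3)  $\sigma_{n_{1.}^{(j)} n_{1.}^{(l)}} = -n p_{1.}^{(j)} p_{1.}^{(l)}.$


With these expressions for the variances and covariances we have


(5.4)  $\sigma^2(c) = \sum_{j=1}^{k} (\beta_j - \gamma_k)^2\, n p_{1.}^{(j)} (1 - p_{1.}^{(j)}) - n \sum_{j=1}^{k} \sum_{l \ne j} (\beta_j - \gamma_k)(\beta_l - \gamma_k)\, p_{1.}^{(j)} p_{1.}^{(l)}$

$\quad = n \sum_{j=1}^{k} (\beta_j - \gamma_k)^2 p_{1.}^{(j)} - n \sum_{j=1}^{k} \sum_{l=1}^{k} (\beta_j - \gamma_k)(\beta_l - \gamma_k)\, p_{1.}^{(j)} p_{1.}^{(l)}$

$\quad = n \sum_{j=1}^{k} (\beta_j - \gamma_k)^2 p_{1.}^{(j)} - n \left[\sum_{j=1}^{k} (\beta_j - \gamma_k)\, p_{1.}^{(j)}\right]^2,$

and hence

(5.5)  $\sigma^2(c) = n\left[\sum_{j=1}^{k} (\beta_j - \gamma_k)^2 p_{1.}^{(j)} - \left(\dfrac{E(c)}{n} - \gamma_k\right)^2\right].$

3 See e.g., Cramér [2] p. 318.
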
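
Formulas (5.1) and (5.5) translate directly into code. The sketch below is an illustration only: the function names are assumptions, and the inputs are the first two rows of Table I of the numerical example together with the sample size 179 from Table II.

```python
from math import sqrt


def expected_cost(n, p, beta, gamma_k):
    """Expected field cost (5.1) for sample size n and k = len(p) call-backs.

    p[j] is the probability of first being available at call j + 1,
    beta[j] the cost of such an individual, gamma_k the cost of an
    individual still non-available after the k-th call.
    """
    per_person = sum((b - gamma_k) * pj for b, pj in zip(beta, p)) + gamma_k
    return n * per_person


def cost_std_dev(n, p, beta, gamma_k):
    """Standard deviation of the field cost, from (5.5)."""
    second_moment = sum((b - gamma_k) ** 2 * pj for b, pj in zip(beta, p))
    mean_excess = expected_cost(n, p, beta, gamma_k) / n - gamma_k
    return sqrt(n * (second_moment - mean_excess ** 2))


# k = 2 call-backs, values from Table I; n = 179 from Table II.
p = [0.65, 0.24]
beta = [0.98, 1.34]
gamma_2 = 0.59
print(round(expected_cost(179, p, beta, gamma_2), 2))
print(round(cost_std_dev(179, p, beta, gamma_2), 2))
```

These two numbers can be compared with the corresponding entries of Tables III and IV for $\delta = .15$ and $k = 2$.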
For given precision $\delta$ and probability level $\alpha$, we are now able to
determine $k$ and $n$ so that

$P(|H| \le \delta) \ge 1 - \alpha$


and that $E(c)$ is as small as possible. From (4.2) we obtained the
smallest value of $n$ in terms of $\delta$, $\alpha$ and $p_{0.}$. For our procedure $\mathfrak{S}_k$ we
have


$p_{0.} = p_{0.}^{(k)} = 1 - \sum_{j=1}^{k} p_{1.}^{(j)},$


$n(\delta, \alpha, p_{0.}) = n(\delta, \alpha, p_{0.}^{(k)}).$


Substituting this value for $n$ in (5.1), we obtain

(5.6)  $E_{(k)}(c) = n(\delta, \alpha, p_{0.}^{(k)}) \left[\sum_{j=1}^{k} (\beta_j - \gamma_k)\, p_{1.}^{(j)} + \gamma_k\right],$


an expression which, for given $\delta$, $\alpha$, depends only on $k$. To minimize the
expected cost of a sample survey with call-backs, the value $k = \bar{k}$ which
minimizes (5.6) must be determined. This can be done by trial in the
following manner:

First one determines the smallest integral value $k'$ of $k$ so that


(5.7)  $\sum_{j=1}^{k'} p_{1.}^{(j)} \ge 1 - \delta,$


and computes $E_{(k')}(c)$ from (5.6). Then one computes $E_{(k)}(c)$ for all
values $k > k'$ for which the constants $p_{1.}^{(j)}$, $\beta_j$ $(j = 1, 2, \cdots, k)$ and $\gamma_k$
are empirically known, and chooses for $\bar{k}$ the value of $k$ minimizing
$E_{(k)}(c)$. Once $\bar{k}$ is obtained, $n(\delta, \alpha, p_{0.}^{(\bar{k})})$ is determined by (4.2) for
$k = \bar{k}$, and the variance of the cost is obtained by (5.5).
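
The trial procedure for choosing the number of call-backs can be sketched in Python as below. The helper names are assumptions; the constants are the values of Table I in the numerical example. Because those tabulated probabilities are rounded, the sketch need not reproduce every printed entry of Tables II and III, particularly where $p_{0.}^{(k)}$ is very small. The sketch also requires $\delta > p_{0.}^{(k)}$ strictly, following (A.3).

```python
from math import ceil
from statistics import NormalDist

# Constants from Table I of the numerical example (k = 1, ..., 5).
P = [0.65, 0.24, 0.07, 0.02, 0.01]      # p_1.^(j)
BETA = [0.98, 1.34, 1.78, 2.36, 3.03]   # beta_j, dollars
GAMMA = [0.23, 0.59, 1.03, 1.61, 2.28]  # gamma_k, dollars


def sample_size(delta, alpha, p0):
    """Smallest integer n satisfying (4.2)."""
    t_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return max(1, ceil(t_alpha**2 / (4 * delta * (1 - p0) * (delta - p0)) - 1))


def cost_by_callbacks(delta, alpha, p, beta, gamma):
    """Map each admissible k to (n, expected cost) using (4.2) and (5.6)."""
    results = {}
    for k in range(1, len(p) + 1):
        p0 = 1.0 - sum(p[:k])
        if p0 >= delta:  # precision unattainable with only k calls
            continue
        n = sample_size(delta, alpha, p0)
        g = gamma[k - 1]
        per_person = sum((beta[j] - g) * p[j] for j in range(k)) + g
        results[k] = (n, n * per_person)
    return results


def cheapest_k(results):
    """The k with the smallest expected cost among those tried."""
    return min(results, key=lambda k: results[k][1])


table = cost_by_callbacks(0.15, 0.05, P, BETA, GAMMA)
for k, (n, cost) in sorted(table.items()):
    print(k, n, round(cost, 2))
print("cheapest k:", cheapest_k(table))
```

Only values of $k$ for which all cost and probability constants are known can be tried, so the minimum found is relative to the range of $k$ supplied.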

The preceding discussion was confined to the planning stage of the
survey, when an estimate of the cost is desired but an estimate of $p'$ is
not yet obtained. Consequently, the most pessimistic value of $p' = \bar{p}$
given by (A.42) was used. With the completion of the survey and with
an estimate $U = n_{11}/n_{1.}$ of $p'$ available, a revised improved
determination of $\delta$ and $(1 - \alpha)$ is possible from (A.4).


VI. A NUMERICAL EXAMPLE 


The following values of $p_{1.}^{(j)}$ are based on actual data.⁴ The
cost-coefficients $\beta_j$, $\gamma_k$, though somewhat arbitrary, were chosen to conform
with experience of the Washington Public Opinion Laboratory.


TABLE I

  j (= k)    $p_{1.}^{(j)}$    $\beta_j$     $\gamma_k$
     1          .65        $0.98     $0.23
     2          .24         1.34      0.59
     3          .07         1.78      1.03
     4          .02         2.36      1.61
     5          .01         3.03      2.28


These empirical values were used to obtain the following tables for the
probability level $\alpha = .05$.


TABLE II

MINIMUM SAMPLE SIZE $n(\delta, .05, p_{0.}^{(k)})$ FOR PRECISION $\delta$,
PROBABILITY LEVEL $\alpha = .05$, AND $k$ CALL-BACKS

   k     δ=.15    δ=.10    δ=.05    δ=.03    δ=.02
   2      179       -        -        -        -
   3       60      166     2,000      -        -
   4       50      122       653    3,267      -
   5       47      108       486    1,617    9,702


4 Unpublished data from Survey of Temporary Housing Projects, Housing Authority of the City
of Seattle, placed at our disposal by courtesy of Edith Dyer Rainboth, consultant to the survey.
TABLE III

EXPECTED COST $E_{(k)}(c)$ FOR PRECISION $\delta$, PROBABILITY
LEVEL $\alpha = .05$, AND $k$ CALL-BACKS

   k      δ=.15      δ=.10       δ=.05        δ=.03        δ=.02
   2    $183.21        -           -            -            -
   3      67.46    $186.65    $2,248.80         -            -
   4      58.13     141.84       759.18    $3,797.05         -
   5      55.62     127.82       575.18     1,913.72    $11,482.31


TABLE IV

STANDARD DEVIATION $\sigma(c)$ OF COST $c$ FOR PRECISION $\delta$,
PROBABILITY LEVEL $\alpha = .05$, AND $k$ CALL-BACKS

   k     δ=.15    δ=.10    δ=.05     δ=.03     δ=.02
   2     $2.87      -        -         -         -
   3      1.80    $2.99   $10.36       -         -
   4      2.10     3.25     7.19    $15.52       -
   5      2.49     3.77     7.99     14.58    $35.71


From the preceding numerical example we conclude that, for the
probabilities $p_{1.}^{(j)}$ and cost coefficients $\beta_j$, $\gamma_k$ used, and the probability
level $\alpha = .05$, each additional call-back reduces the expected total cost
for precisions $\delta = .15, .10, .05, .03, .02$. Our values of $p_{1.}^{(j)}$, $\beta_j$, $\gamma_k$ were
estimated on the basis of available empirical data and their properties may
be expected to be typical. It remains possible, however, that, for some
other values of these constants, consecutive increases of the number
of call-backs will not always result in a reduction of the total expected
cost. Any definite statement on this matter can be made only after the
constants $p_{1.}^{(j)}$, $\beta_j$, $\gamma_k$ are evaluated for the particular survey
organization and tables similar to our Tables II, III, IV are computed.


APPENDIX 
We wish to determine the sample size $n$ so that

(A.1)  $P(|H| \le \delta) \ge 1 - \alpha.$

We have

$P(|H| \le \delta) = P(-b - \delta \le S \le -b + \delta)$

and this is, for $n$ large, approximately equal⁵ to

(A.11)  $P_n(p', b) = \dfrac{1}{\sqrt{2\pi}} \int_{(-b-\delta)/\sigma_S}^{(-b+\delta)/\sigma_S} e^{-t^2/2}\, dt, \quad \text{where } \sigma_S = \sqrt{\dfrac{p'(1 - p')}{(n + 1)p_{1.}}}.$

For any fixed $p'$, it follows from (2.1) that


$\min_{(b)} P_n(p', b) = \min \{P_n(p', -p_{0.}(1 - p')),\; P_n(p', p_{0.}p')\}.$


Since $p'$ is not known, the only available information on $b$ is the
inequality

(A.2)  $|b| \le p_{0.}$


which follows from (2.1). The bias component of $H$ alone may,
therefore, be as large as $p_{0.}$, and hence the prescribed precision $\delta$ must be
greater:


(A.3)  $\delta > p_{0.}.$

If $p' \ge .5$, we have

(A.4)  $\min_{(b)} P_n(p', b) = P_n(p', p_{0.}p') = \dfrac{1}{\sqrt{2\pi}} \int_{-(\delta - p_{0.}p')/\sigma_S}^{(p_{0.}p' + \delta)/\sigma_S} e^{-t^2/2}\, dt = \varphi_n(p').$

In view of (A.3) we have

$0 \le \delta - p_{0.}p' \le \delta + p_{0.}p'$

and hence

(A.41)  $\varphi_n(p') \ge \dfrac{2}{\sqrt{2\pi}} \int_0^{(\delta - p_{0.}p')/\sigma_S} e^{-t^2/2}\, dt = \psi_n(p').$

For $.5 \le p' \le 1$, the function $\psi_n(p')$ has its minimum for

(A.42)  $p' = \dfrac{\delta}{2\delta - p_{0.}} = \bar{p}$

which in view of (A.3) is between .5 and 1. It follows that


(A.5)  $\psi_n(p') \ge \dfrac{2}{\sqrt{2\pi}} \int_0^{2\sqrt{(n+1)p_{1.}}\,\sqrt{\delta(\delta - p_{0.})}} e^{-t^2/2}\, dt = \mu(n, p_{0.})$


for $1 \ge p' \ge .5$.
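
The minimizing value (A.42) can be checked numerically. Up to the factor $\sqrt{(n+1)p_{1.}}$, the upper limit of the integral in $\psi_n(p')$ is $(\delta - p_{0.}p')/\sqrt{p'(1-p')}$; the Python sketch below compares a grid search over $.5 \le p' < 1$ with the closed form $2\sqrt{\delta(\delta - p_{0.})}$. The function name and the inputs $\delta = .15$, $p_{0.} = .11$ are illustrative assumptions.

```python
from math import sqrt


def scaled_limit(p_prime, delta, p0):
    """(delta - p0 * p') / sqrt(p' * (1 - p')): the upper limit of the
    integral in psi_n(p'), divided by sqrt((n + 1) * p_1.)."""
    return (delta - p0 * p_prime) / sqrt(p_prime * (1.0 - p_prime))


delta, p0 = 0.15, 0.11

# Closed-form minimizer (A.42) and the minimum value used in (A.5).
p_bar = delta / (2.0 * delta - p0)
closed_form_min = 2.0 * sqrt(delta * (delta - p0))

# Brute-force search on a fine grid over [0.5, 1).
grid = [0.5 + 0.5 * i / 100000 for i in range(100000)]
grid_min = min(scaled_limit(x, delta, p0) for x in grid)

print(p_bar, closed_form_min, grid_min)
```

Since the normal integral increases with its upper limit, the smallest limit gives the smallest value of $\psi_n(p')$, which is why minimizing this ratio is enough.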





5 This follows, for example, from Theorem 1 in Doob [3]. 
By a similar argument (A.5) can be proven for $0 \le p' \le .5$, and hence
the inequality (A.5) is true for $0 \le p' \le 1$.
Making use of the approximation (A.11), we conclude


(A.6)  $P(|H| \le \delta) \approx P_n(p', b) \ge \varphi_n(p') \ge \psi_n(p') \ge \mu(n, p_{0.}).$


Since

$\lim_{n \to \infty} \mu(n, p_{0.}) = 1,$


it is always possible to find, for given $\delta$ and $p_{0.}$, a value $n$ such that

$\mu(n, p_{0.}) \ge 1 - \alpha.$

In particular, if $T_\alpha$ is such that


$\dfrac{2}{\sqrt{2\pi}} \int_0^{T_\alpha} e^{-t^2/2}\, dt = 1 - \alpha$

we see from (A.6) that

(A.7)  $n \ge \dfrac{T_\alpha^2}{4\delta(1 - p_{0.})(\delta - p_{0.})} - 1$

is sufficient for (A.1), that is for assuring precision $\delta$ at the probability
level $1 - \alpha$. We denote the smallest integer fulfilling (A.7) by $n(\delta, \alpha, p_{0.})$.

It should be pointed out that, generally, a value of $n$ smaller than
$n(\delta, \alpha, p_{0.})$ could be obtained by finding


$\min_{0 \le p' \le 1} \varphi_n(p') = \nu(n, p_{0.})$


and determining the least $n$ such that $\nu(n, p_{0.}) \ge 1 - \alpha$. This seems to be
feasible only by means of laborious tabulations.
CORRECTION TO “ON THE BEST CHOICE OF SAMPLE SIZES
FOR A t-TEST WHEN THE RATIO OF VARIANCES IS KNOWN”


JOHN E. WALSH


In the paper “On the Best Choice of Sample Sizes for a t-Test
When the Ratio of Variances is Known,” (Journal of the American
Statistical Association, Volume 44, 1949, pp. 554-558), the criterion
(2) of page 555 should read


$\dfrac{\theta/n + 1/m}{1 - K^2/2(n+m-2)}$


rather than


$(\theta/n + 1/m)\,[1 - K^2/2(n+m-2)].$





AN APPLICATION OF SCALING TO 
QUESTIONNAIRE CONSTRUCTION* 


CHARLES A. METZNER
Survey Research Center, University of Michigan 


The basic problem considered was that of selecting one out 
of a number of proposed question forms in survey work. In 
the particular case, there were two apparent directions of 
bias. Questions were designed to represent each of these, and
each question of the set was judged in relation to these and to 
each other. Thurstone scaling of judgments was used to indi- 
cate which question was judged to lie centrally with respect to 
the set. The method is applicable to further work on question 
construction. 


THE PROBLEM motivating this study was that of choosing one question
form, out of a number proposed to meet a particular interview
survey objective, that would convey to respondents the exact meaning 
intended. The method by which it was attempted to obtain an ap- 
proximation to this was an application of the scaling of judgments. The 
method was used as a practical technique for the detection of bias, so 
that a question could be selected which might be presumed to be rela- 
tively free of suggestion. 

Alternative methods of formulating a question are usually considered 
in designing a questionnaire. The methods for deciding which of the 
forms is to be used are neither objective nor quantitative. They are, 
certainly, often empirical, in that discussion and criticism of the ques- 
tions by the group responsible for the survey make available the ex- 
perience of the group. Many pitfalls may be avoided by this. However, 
none of this experience is very wide nor very well formulated, so that 
extension to new problems is fraught with danger that cannot be 
assessed. Often experts recognize this by pretesting several alternative 
formulations to determine in a rough way which obtains the most in- 
formation relative to the objective, with least problems of understand- 
ing or rapport and least bias. In crucial cases, or to obtain more definite 
knowledge, several forms may be incorporated in the survey design in 
such a manner that comparisons may be made of the information de- 
rived from each. If a sufficient number of these studies were conducted 
and it were possible to relate them, we would know enough about word 





* This work was done under the general direction of Mr. Howard G. Brunsman, then Assistant 
Chief, now Chief, Population Division, Bureau of the Census, with the assistance and encouragement 
of Mr. E. Everett Ashley, 3rd, and Mr. William Bloom of the National Housing Agency (now the 
Office of the Administrator of the Housing and Home Finance Agency). 
usage and meaning so that problems of question construction could be
handled scientifically. As it is, however, semantic theory has too small 
an experimental base to allow application to detailed problems, and we 
must proceed on a more narrowly empirical basis. 

Specifically, the particular problem arose during 1946 when the 
Bureau of the Census initiated a series of sample surveys in various 
communities throughout the United States to ascertain the housing 
intentions of returned veterans of World War II as a part of its work 
for the National Housing Agency. Among the many objectives in this 
complex area was that of determining the price level at which a veteran 
would enter the housing market. A similar question was to be asked of 
the veteran who wanted to buy or build, as of the one who wanted to 
rent. Preliminary discussion had indicated that in either case the price 
reported would fluctuate in accordance with stress on the price that 
could be paid or on the price the respondent desired to pay. Since there 
were many other objectives of the survey, it was decided to use only 
one question rather than explore all of the variables. That one question 
should obtain a realistic value that was neither pessimistically high nor 
wishfully low. A variety of questions was suggested to meet this objec- 
tive, a number of which are given below in the chart. Time pressure 
made necessary a question choice with a minimum of pre-testing. One 
form was selected for use, but the discussion had revealed a disparity of 
viewpoints indicating the value of further work. 

The proposed solution to this kind of problem was to scale a set of 
questions relating to the objective. The set of questions in this case 
varied from those tending toward a desired level of expenditure to those 
suggesting an amount that might possibly be paid, and it was postu- 
lated that a question near the middle of the set when scaled along the 
continuum from desirable to possible would represent a value that 
would actually be paid. 

The research was conducted at low cost with emphasis on practical 
results rather than on a design to evaluate semantic variables. Basic 
validation to actual housing economic behavior was not feasible. There 
were three questions concerning the proposal: Was the technique prac- 
ticable with respect to time, cost, and acceptance by people making the 
necessary judgments? Were the results internally consistent so that 
the scaling might be considered valid? Did the method give results 
congruent with known differences of wording and consequent meaning? 

The method was kept relatively simple. Ten questions were selected 
for study. Eight of these had been discussed as possible forms for use, 
one of them being similar to a question used in a Fortune housing 
survey.¹ Two questions were designed to be biased in the alternative
directions of maximum and minimum amount. These were to serve 
as end-points for a simple validation of the method. The questions as 
presented, which are listed in Chart I below, did not include a refer- 
ence to rent or to purchase price, either of which would have been highly 
repetitive. This reference was made in the instructions. 

Thurstone scaling was used, both Case III and Case V of his applica- 
tions of the “law of comparative judgment” being employed. The ques- 
tions were presented in full paired-comparison form so that a detailed 
analysis could be made. Questions were systematically permuted within 
and between the forty-five pairs to minimize position effects. Three 
pairs were repeated at the end of the list as an additional test of relia- 
bility, making a list of 48 items for judgment. 

One of the incidental problems was whether or not people without a 
technical interest would accept the task of making the necessary com- 
parisons of questions, which is an unusual type of judgment. The in- 
structions read: 

“It would help us in designing new questionnaires if we could have your 
judgment of these possible ways of stating a question. In the parentheses 
after each pair of questions, please write the letter (A or B) of the ques- 
tion form that seems to you to suggest paying more money, as rent for a 
house or apartment, for example. There are no right or wrong answers, of 
course. Please answer each one, even if they seem pretty much alike and 


you have to make a wild guess as to which seems to represent paying 
more money.” 


No sampling of people was even contemplated for this test of the 
method, although it certainly would be necessary if any exact infer- 
ences were to be drawn. Two groups of clerical employees, totaling 100 
people, were chosen in the Bureau of the Census (from the Population 
and Geography Divisions) with wide variation in age, schooling, and 
geographical origin. While this variation does not make them repre- 
sentative of any particular population, the presence of this variation 
would allow the influence of these variables to be indicated by corre- 
sponding variance of the data. 

The method of scaling used was devised by L. L. Thurstone.² It is
based on a generalization of the relationships found in psychological 
work on sensation between the objective measures of some stimulus 
and the subjective judgments of the relationship between values of the 





1 The Fortune Survey, Fortune, Vol. 33, No. 4, April, 1946, p. 266. 

2 Further information concerning the procedure may be obtained from: Thurstone, L. L., and 
Chave, E. J., The Measurement of Attitude. Chicago: University of Chicago Press, 1929; or Guilford, 
J. P., Psychometric Methods. New York: McGraw-Hill, 1936. 
stimulus. Evidence concerning the basic assumptions may, however, be
obtained directly from judgments of other stimuli. As a statistical tech- 
nique the scaling is similar to the probit transformation used in bio- 
assay for response curves. 

If a set of responses (such as judgments) is associated with a stimu- 
lus in such a way that members of the set have probabilities described 
by the normal distribution, the mean value of this distribution of re- 
sponses (called the discriminal dispersion) will correspond to the scale 
value of the stimulus, and the standard deviation of the responses may 
be taken as the unit of the scale. If two stimuli are being compared (as 
greater, or heavier, or the more conservative of two opinions), the pro- 
portion that judgments of “greater” are of all comparative judgments 
will be a function of the scale separation of the stimuli and of the stand- 
ard deviations of the discriminal dispersions involved. The relation be- 
tween these values is called the “law of comparative judgment” by 
Thurstone. It may be represented by 


$S_a - S_b = X_{ab}\sqrt{\sigma_a^2 + \sigma_b^2 - 2r\sigma_a\sigma_b}$


where: $S_a$ and $S_b$ are the scale values of the stimuli.

$X_{ab}$ is the normal deviate corresponding to the proportion of
judgments that one stimulus is the greater.

$\sigma_a$ and $\sigma_b$ are the standard deviations of the discriminal
dispersions.

$r$ is the correlation between discriminal deviations of the
responses during judgment.


Several assumptions must be made to apply the law. (1) It is as- 
sumed that a group of people, each judging once, will give a distribution 
of responses like that obtained for successive judgments by one indi- 
vidual. (2) It is assumed that during judgment deviations are uncorre- 
lated (r=0). (3) We may assume equality of the variances of discrimi- 
nal dispersions. 

Questions

What is the absolute maximum you could possibly pay? (2.49)
How much can you afford to pay? (1.69)
How much are you able to pay? (1.67)
What is the most you would be willing to pay? (1.64) (F)
What would you be able to pay? (1.54) (C)
What do you think you could pay? (1.…)
What are you able to pay? (1.28)
What would be a reasonable amount for you to pay? (0.86)
How much do you think you should pay? (0.51)
What would you like to pay? (0.00)

CHART I
SCALE VALUES OF QUESTIONS ON RENTAL ABILITY


Since, in the method of paired comparisons, each stimulus is
compared with each other stimulus, we have one direct estimate of the scale
separation between two stimuli, and $n - 2$ others of the form
$(S_a - S_k) - (S_b - S_k)$. Variation among these may be used to estimate
$\sigma_a$ and $\sigma_b$ to eliminate assumption (3) above. This form is Case III of
the five cases of application discussed by Thurstone. When all three
assumptions are made, the law reduces to $S_a - S_b = X_{ab}\sqrt{2}$, if
$\sigma_a = \sigma_b = \cdots = \sigma_n$ is used as the scale unit. This is Case V. Scale
differences under the Case V assumptions are computed by substituting
in the formula: $S_a - S_b = (\sqrt{2}/n)(\sum X_{ak} - \sum X_{bk})$, summation being
over $k = 1, 2, \cdots, n$. Scale values are then determined by arbitrarily
assigning zero to one (usually the lowest) and adding the differences.
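
The Case V computation described above can be sketched in Python. The function name, the layout of the proportion matrix, and the three-item test data are assumptions made for illustration; the study's own judgment proportions are not reproduced here. Each proportion must lie strictly between 0 and 1, since the normal deviate is undefined otherwise, and the diagonal is taken as .5 so that $X_{aa} = 0$.

```python
from math import sqrt
from statistics import NormalDist


def case_v_scale_values(proportions):
    """Thurstone Case V scale values from a paired-comparison matrix.

    proportions[a][b] is the proportion of judgments rating item a greater
    than item b. The lowest item is assigned the value zero.
    """
    unit_normal = NormalDist()
    n = len(proportions)
    # Sum over k of the normal deviates X_ak for each item a.
    deviate_sums = [sum(unit_normal.inv_cdf(p) for p in row) for row in proportions]
    raw = [sqrt(2.0) / n * total for total in deviate_sums]
    lowest = min(raw)
    return [value - lowest for value in raw]


# Synthetic check: build proportions from known scale values under Case V.
true_values = [0.0, 0.5, 1.2]
synthetic = [
    [NormalDist().cdf((sa - sb) / sqrt(2.0)) for sb in true_values]
    for sa in true_values
]
recovered = case_v_scale_values(synthetic)
print(recovered)
```

Differences between any two recovered values equal $(\sqrt{2}/n)(\sum X_{ak} - \sum X_{bk})$, so anchoring the lowest item at zero only fixes the arbitrary origin of the scale.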

The results are presented in Chart I, which lists the questions and 
their scale values computed according to Thurstone’s Case V. It may 
be noted first of all that the questions designed as end-points are the 
ones that fall on the extremes of the scale. 

The questions similar to those used by the Bureau of the Census and 
the Fortune Survey, which are designated in the chart by a C and an F 
respectively, both fall near the center of the scale, and may be considered
unbiased with respect to the extremes of the continuum. In this
case, the choice of question at the Census Bureau was confirmed. 

There are a number of results which would not have been predicted, 
such as the difference in scale values between “How much are you able 
to pay?”, 1.67, and “What are you able to pay?”, 1.28. The inference 
from this is borne out by other items, to indicate at once the danger of 
a non-empirical approach and the importance of a factorial design to 
provide information on the semantic values of various words along this 
continuum. This latter could, with validation, form the basis of scien- 
tific question construction. 

The difference in scale positions for two items may be used to esti- 
mate the proportion of times one would be judged greater than the 
other. These estimates do not necessarily agree with the observed pro- 
portions, and the differences are a measure of the consistency of judg- 
ments with the assumptions of the scale, or of how well the scale actu- 
ally represents the data. The number of degrees of freedom appropriate 
to the comparison is difficult to assess, so that little more can be said 
than that the agreement is reasonable. The average difference was 
about two percentage points. 
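
Under the Case V assumptions, the proportion implied by two scale values is the normal probability of $(S_a - S_b)/\sqrt{2}$. The Python sketch below applies this to two scale values taken from Chart I; the function name is an assumption, and the observed proportions from the study are not available here for comparison.

```python
from math import sqrt
from statistics import NormalDist


def predicted_proportion(s_a, s_b):
    """Proportion of judgments expected to rate item a above item b under
    Case V, where S_a - S_b = X_ab * sqrt(2)."""
    return NormalDist().cdf((s_a - s_b) / sqrt(2.0))


# Scale values from Chart I: "How much are you able to pay?" (1.67)
# versus "What are you able to pay?" (1.28).
estimate = predicted_proportion(1.67, 1.28)
print(round(estimate, 3))
```

Comparing such estimates with the observed proportions for every pair gives the average discrepancy the text reports as a measure of internal consistency.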

The Case III analysis indicated that the estimated standard deviation
of responses to a given item ($\sigma_i$) was uniformly greater for items
having scale positions either higher or lower than about 1.6. This again
is difficult to interpret, although it is interesting that the only question 
to depart from this pattern is the “Fortune” question, which may de- 
rive its central scale position from the contrary suggestiveness of the 
terms “most” and “willing,” with some consequent increase in varia- 
bility of response. This is frankly speculative, and presented to indicate 
a direction for further research. 

There was no indication, either from comments or inconsistencies of 
data, that any of the people making the judgments thought the idea 
silly or otherwise not worth doing seriously. The three pairs of questions 
which were repeated at the end of the questionnaire gave proportions of 
judgments which agreed within three percentage points with the corre- 
sponding proportions from the body of the questionnaire. 

Another readily appreciated indication of reliability derives from the 
fact that scale values determined from the data of a third group of 14 
people from the National Housing Agency correlated 0.97 with those 
given above. 

In conclusion, it may be said that the results warrant further trial of 
the procedure. It was assumed in this study that a question with a central
scale position would be relatively free of the two directions of bias
which were believed to represent unrealistic extremes. Bias is, of course, 
relative to the aim of a study, so that if one wishes to obtain data 
relative to what is, from another point of view, an extreme position, 
the problem still remains the same—that of determining which ques- 
tion will obtain what level of answer. 

The scale values of the two questions presumed here to indicate dif- 
ferent directions of bias were in accord with these presuppositions. 
Whether the position of other items accurately reflected the differences 
to be expected in answers obtained could be determined only by valida- 
tion against more basic criteria. Questions which had been independ- 
ently considered to meet the objective did receive middle scale positions. 
Also, the scale values did seem accurately to reflect the judgments 
made, in the sense that the differences in the scale values absorbed most 
of the variation in judgments. Adequate statistical tests of scales in 
terms of their efficiency in representing sets of judgments would be 
desirable. 

From the point of view of summarization alone, scales are valuable. 
The contribution of scaling to question construction depends on de- 
termining what is only suggested here—that scaling of judgments 
orders questions with respect to answers to be expected. This would 
provide a practicable basis of quantitative semantic analysis, necessary 
to make question construction accurate and objective. 





A NOTE ON P. K. WHELPTON’S CALCULATION OF 
PARITY ADJUSTED REPRODUCTION RATES* 


P. H. KARMEL 
Trinity College, Cambridge 


Reproduction rates calculated from specific fertility rates 
controlled for parity and age, but not fully controlled for 
nuptiality may be more sensitive to fluctuations in marriage 
rates than the conventional reproduction rates. Consideration 
of this factor is necessary before drawing conclusions concern- 
ing changes in fertility. 


DR. WHELPTON’S calculation of reproduction rates from specific
fertility rates controlled for parity as well as for age is an extremely
valuable advance in the development of measures of fertility
and reproductivity. Nevertheless, it seems necessary to issue a word 
of caution in the use of these rates when they are not fully controlled 
for nuptiality. Dr. Whelpton himself on page 514 points out that it 
would be best to calculate what is in effect a nuptiality table and to 
“relate first births to ever married (rather than total) fecund zero 
parity women by age in the base population” and then to apply these 
rates to the nuptiality table. In actual practice, insufficient data are 
available to do this, so in his calculations Dr. Whelpton has used zero 
parity age specific fertility rates based on all rather than on married 
women, adjusted in a rather arbitrary fashion for fecundity and 
spinsterhood. 

As a result of this, the zero parity age specific fertility rates which he 
uses will be sensitive to changes in marriage rates, and may well tend 
to exaggerate fluctuations in reproduction rates due to fluctuations in 
marriage rates even more than the simple age adjusted reproduction 
rates. This can be seen by taking a simple case. Let us suppose that 
fertility conditions are defined by the probabilities of married women 
aged x of parity n bearing a child of order n+1 and mortality and 
nuptiality conditions are defined in the usual way. Further, suppose 
that the distribution of the actual population by age, marital status 
and parity is the same as that of the hypothetical population implied 
in the fertility, mortality and nuptiality conditions. If for some reason 
there is a temporary rise in marriage rates, i.e., some marriages are 
brought forward, then, in the year immediately following the rise, 
with fertility conditions unchanged, there will be an increase in first 
births, of, say, a fraction k. In that year a reproduction rate controlled 





*P. K. Whelpton, “Reproduction Rates Adjusted for Age, Parity, Fecundity and Marriage,”
Journal of the American Statistical Association, vol. 41, 1946, p. 501.
for marital status and duration of marriage, calculated on the basis of
standard nuptiality conditions will show no rise, since fertility condi- 
tions have, ex hypothesi, remained the same. This is quite correct be- 
cause fertility is unchanged and the rise in the marriage rates only 
temporary. A simple age adjusted reproduction rate will show some 
rise, but since first births are only a part of all births the rise will be less 
than k. Dr. Whelpton’s zero parity age specific fertility rates will rise 
by k, since these are calculated on the basis of all females, the number 
of whom, ex hypothesi, remains the same, and not on the basis of
married females, the number of whom has been increased by the rise 
in the marriage rates. Accordingly, the age parity adjusted net repro- 
duction rate will rise, since the zero parity specific fertility rates which 
contribute to it have risen. Hence a temporary increase in marriage 
rates will cause a rise in both the simple age adjusted net reproduction 
rate and the age parity adjusted net reproduction rate. These rises will 
be misleading, since the underlying conditions of reproductivity have 
remained the same. The question is: is the more refined age parity 
adjusted rate likely to be less sensitive to such temporary fluctuations 
in marriage rates and hence more satisfactory as a measure of repro- 
ductivity? The exact answer to this question will of course vary ac- 
cording to the particular case under consideration, but the following 
analysis may prove useful. 
Let
$\mu(x)$ be the rate of mortality for females aged $x$;
$p_1(x)$ be the rate at which zero parity females produce offspring at
age $x$; and
$F_1$ be the number of first births in a hypothetical cohort of females
subject to mortality conditions $\mu(x)$ and fertility conditions
$p_1(x)$.
Then

$$F_1 = \int_0^{50} e^{-\int_0^x [\mu(z) + p_1(z)]\,dz}\, p_1(x)\,dx \tag{1}$$

where 50 years is taken as the upper limit of reproductivity. In practice
we can write

$$F_1 = e^{-\int_0^{\bar{x}} \mu(z)\,dz} \int_0^{50} e^{-\int_0^x p_1(z)\,dz}\, p_1(x)\,dx \tag{2}$$

where $\bar{x}$ is the mean age at confinement in the distribution of
$e^{-\int_0^x p_1(z)\,dz}\, p_1(x)$.
It can be seen that

$$F_1^0 = \int_0^{50} e^{-\int_0^x p_1(z)\,dz}\, p_1(x)\,dx \tag{3}$$

is the number of first births in a hypothetical cohort of females subject
to no mortality and fertility conditions $p_1(x)$. It stands in the same
relation to the gross reproduction rate as $F_1$ does to the net reproduction
rate. Integrating (3), we can write

$$F_1^0 = 1 - e^{-\int_0^{50} p_1(z)\,dz}. \tag{4}$$
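The step from the integral (3) to the closed form (4) can be checked numerically. The following is only a sketch under stated assumptions: the flat schedule `p1` (a rate of 0.05 per year between ages 15 and 50) is hypothetical and is not taken from Dr. Whelpton's data, and the function names are illustrative.

```python
import math


def p1(x):
    """Hypothetical first-birth rate for zero parity females at age x (assumed)."""
    return 0.05 if 15.0 <= x < 50.0 else 0.0


def first_births_by_integral(rate, upper=50.0, steps=50_000):
    """Expression (3), evaluated step by step with the midpoint rule."""
    h = upper / steps
    cumulative = 0.0  # integral of the rate from age 0 to the start of the step
    total = 0.0
    for i in range(steps):
        r = rate((i + 0.5) * h)
        # proportion still of zero parity at the midpoint, times births in the step
        total += math.exp(-(cumulative + 0.5 * r * h)) * r * h
        cumulative += r * h
    return total


def first_births_closed_form(rate, upper=50.0, steps=50_000):
    """Expression (4): one minus the proportion never bearing a first child."""
    h = upper / steps
    integral = sum(rate((i + 0.5) * h) for i in range(steps)) * h
    return 1.0 - math.exp(-integral)


f3 = first_births_by_integral(p1)
f4 = first_births_closed_form(p1)
print(round(f3, 6), round(f4, 6))
```

Under this assumed schedule the two evaluations agree to well within rounding, which is all that (4) asserts.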


If the rates $p_1(x)$ are increased by a constant fraction $k$, this will lead
to an increase in $F_1^0$ of a fraction, say, $c$, and we will have

$$(1 + c)F_1^0 = 1 - e^{-(1+k)\int_0^{50} p_1(z)\,dz} = 1 - (1 - F_1^0)^{1+k}. \tag{5}$$


Hence

$$1 + c = \frac{1 - (1 - F_1^0)^{1+k}}{F_1^0}. \tag{6}$$

But since the increase in the $p_1(x)$ rates will somewhat lower the mean
age at confinement in the hypothetical cohort which makes up $F_1^0$,
from (2) it can be seen that $F_1$ will increase by an amount a little greater
than $c$. For practical purposes we may assume that $F_1$ increases by $c$.
It can immediately be seen from (3) that, if $p_1(x)$ increases by $k$, the
increase, $c$, in $F_1^0$ must be less than $k$. This means that an increase
in the zero parity age specific fertility rates, $p_1(x)$, of $k$ will raise
the first births component of the age parity adjusted net reproduction
rate by a smaller fraction $c$, where $c$ is given by (6) above. Accordingly
the total of one parity females in the hypothetical cohort
will be raised by $c$, but the increase will not be uniform over all ages.
For if fifteen years is taken as the earliest age of child-bearing, then
the number of first children born to females aged fifteen in the hypothetical
cohort will be increased by $k$, but the number born to females
aged sixteen will be increased by less than $k$, since the number of zero
parity females exposed to the risk of producing their first offspring will
have been reduced by the higher zero parity specific fertility rate operating
at age fifteen. So the increase in first children born to females
aged $x$ in the hypothetical cohort will be continuously reduced as $x$
increases and will, in fact, ultimately become negative. Hence the main
weight of the increase in the total of one parity females will be in the
earlier ages. Since the one and higher parity age specific fertility rates
are not affected, ex hypothesi, by the rise in $p_1(x)$ and since they are
greater in value in the younger age groups, the second and higher order
births will increase by a fraction greater than $c$, so that the age parity
adjusted net reproduction rate will increase by more than $c$. The table
below sets out values of $c$ for $k$ equal to 0.1 and 0.05 with various values
of $F_1^0$ and the approximate corresponding values of $F_1$.¹
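The age pattern described in the preceding paragraph (a rise of $k$ at the earliest age, shrinking with age and finally turning negative) may be illustrated with a small sketch. It assumes, purely for illustration, a constant zero parity rate of 0.05 per year from age 15 and no mortality; neither the rate nor the function names come from the original calculations.

```python
import math

HAZARD = 0.05  # hypothetical constant first-birth rate for zero parity females
K = 0.1        # fractional rise in that rate


def first_birth_density(hazard, years_since_15):
    """First births per woman per year at a given time after age 15 (no mortality)."""
    return hazard * math.exp(-hazard * years_since_15)


def relative_change(hazard, k, years_since_15):
    """Fractional change in first births at one age when the rate rises by k."""
    before = first_birth_density(hazard, years_since_15)
    after = first_birth_density((1.0 + k) * hazard, years_since_15)
    return after / before - 1.0


changes = {age: relative_change(HAZARD, K, age - 15) for age in range(15, 50, 5)}
for age, change in changes.items():
    print(f"age {age}: {change:+.3f}")
```

With these assumed figures the change equals $k$ at age 15, declines steadily, and is negative at the oldest ages shown, because fewer zero parity women remain exposed to risk.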


TABLE 1

          k=0.1     k=0.05

          0.049     0.025
          0.037     0.019
          0.023     0.012
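Formula (6) is easily evaluated. In the sketch below the values 0.7, 0.8 and 0.9 for $F_1^0$ are an assumption made for illustration (they are not read from the table itself), and the function name is hypothetical. With those assumed values the formula gives 0.049, 0.037 and 0.023 for $k=0.1$, and 0.025, 0.019 and 0.012 for $k=0.05$.

```python
def fractional_rise_c(f10, k):
    """Formula (6): c = [1 - (1 - F_1^0)^(1+k)] / F_1^0 - 1."""
    return (1.0 - (1.0 - f10) ** (1.0 + k)) / f10 - 1.0


ASSUMED_F10 = (0.7, 0.8, 0.9)  # assumed illustrative values of F_1^0

table_c = {
    f10: tuple(round(fractional_rise_c(f10, k), 3) for k in (0.1, 0.05))
    for f10 in ASSUMED_F10
}
for f10, row in table_c.items():
    print(f10, row)
```

In every case $c$ is well below $k$, and it falls as $F_1^0$ rises, which is the point made in the text.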














If, in the year in which the rise in the marriage rates which causes
$p_1(x)$ to increase by $k$ occurs, the actual population had the same
distribution in respect of age and parity as the hypothetical cohort
implied in the given conditions of fertility and mortality, then the
contribution of first births to the simple age adjusted net reproduction
rate will be given by $F_1/(1+m)$, where $m$ is the masculinity at birth,
so that the simple rate would as a result of the increase in first births
by $k$ be raised by a fraction

$$g = \frac{kF_1}{(1 + m)R_0}$$

where $R_0$ is the net reproduction rate. The table below gives values
of $g$ for $m=1.05$ and $R_0=1.00$ with the values of $F_1$ in table 1 above.
Of course, the higher $R_0$ the lower will be $g$.


TABLE 2

          k=0.1     k=0.05

          0.032     0.016
          0.037     0.018
          0.042     0.021














1 Dr. Whelpton gives for 1942 $F_1^0 = 0.875$ (p. 505) and $F_1 = 0.824$ (p. 509). This gives
$F_1/F_1^0 = e^{-\int_0^{\bar{x}} \mu(z)\,dz} = 0.942$. For the purposes of Table 1, we have taken $F_1 = 0.942\,F_1^0$.
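The fraction $g$ can be evaluated in the same way. This sketch again assumes illustrative values of 0.7, 0.8 and 0.9 for $F_1^0$, converts them to $F_1$ with the factor 0.942 given in the footnote, and uses $m=1.05$ and $R_0=1.00$ as in the text; the function name is hypothetical. Under these assumptions the results agree with Table 2 to within one unit in the third decimal place.

```python
SURVIVAL_FACTOR = 0.942  # F_1 / F_1^0, as given in the footnote


def rise_in_simple_rate(k, f1, m=1.05, r0=1.00):
    """g = k * F_1 / ((1 + m) * R_0)."""
    return k * f1 / ((1.0 + m) * r0)


table_g = {}
for f10 in (0.7, 0.8, 0.9):  # assumed illustrative values of F_1^0
    f1 = SURVIVAL_FACTOR * f10
    table_g[f10] = tuple(round(rise_in_simple_rate(k, f1), 3) for k in (0.1, 0.05))
    print(f10, table_g[f10])

# A higher net reproduction rate lowers g, other things being equal.
g_low_r0 = rise_in_simple_rate(0.1, SURVIVAL_FACTOR * 0.8, r0=1.00)
g_high_r0 = rise_in_simple_rate(0.1, SURVIVAL_FACTOR * 0.8, r0=1.20)
```

Set beside the values of $c$ for the same assumed $F_1^0$, these figures show $g$ below $c$ when $F_1^0$ is low and above it when $F_1^0$ is high.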
A rise in marriage rates in one year which increases first births in the
subsequent year by k will increase the age adjusted net reproduction 
rate by g and the age parity adjusted net reproduction rate by a frac- 
tion greater than c. It is seen by comparing the two tables for c and g 
above that the increase in the age parity adjusted rate would probably 
in fact not be insignificant compared with the increase in the age ad- 
justed rate, but in many instances would be greater than it. It should 
be noted that the greater the probability at birth of a female having 
one child ($F_1$) the smaller will be the rise in the age parity adjusted
rate, because the greater $F_1$ the less opportunity it has for increase
due to an increase in zero parity age specific fertility rates. The precise 
effect on the two rates of an increase in first births could only be worked 
out by using actual data and tracing through the effect of the increase. 
It would, however, be well worth while doing. 

The above reasoning refers to age parity adjusted net reproduction 
rates. But it could also be applied to age parity fecundity marriage 
adjusted rates as calculated by Dr. Whelpton. Such rates would rise 
as a result of an increase in first births but they would not rise as much 
as the age parity adjusted rates, since the probability at birth of a 
marriageable and fecund female having one child will be higher than 
the probability at birth of any female having one child and hence will 
have less opportunity of being increased by an increase in zero parity 
age specific fertility rates. 

The argument set out above has been restricted to an examination of 
the effect of a temporary rise in marriage rates in one year on age ad- 
justed and age parity adjusted net reproduction rates in the following 
year and as such has been concerned with the increase which will occur 
in first births only. Of course the rise in marriage rates will result in 
increases in second and higher order births in later years. These in- 
creases will cause a rise in the age adjusted net reproduction rate when 
they occur, but they will not in themselves cause a rise in the age parity 
adjusted net reproduction rate, since ex hypothesi the one and higher
parity age specific fertility rates are assumed to remain constant. 
Consequently although the immediate effect of a temporary rise in 
marriage rates may be to inflate the age parity adjusted more than the 
age adjusted net reproduction rate, the former will probably return 
to its old level sooner than the latter. 

Argument along the lines presented here suggests that Dr. Whelp- 
ton’s reproduction rates may be even more sensitive to fluctuations in 
marriage rates than the conventional reproduction rates and this seems 
to be a marked disadvantage. A definition of the zero parity specific 
fertility rates in terms of married women only would avoid this
difficulty.

Dr. Whelpton shows that his more refined measures indicate a greater 
rise in fertility than that indicated by the conventional rates for recent 
years and he specifies the period 1940-1943. He says (p. 514): “Reproduction
rates adjusted for age and parity, or for age, parity, sterility,
and spinsterhood, show the wartime rise in fertility to have been
somewhat larger than has been thought on the basis of the conventional 
reproduction rates.” Now, between 1939 and 1942, the United States 
marriage rate rose sharply. It can be suggested that the greater rise 
in the refined measures may be thus largely illusory, and the actual 
rise in fertility smaller, not larger, than that indicated by the simple 
reproduction rates. This certainly was the case in other countries, 
e.g., Australia, England and Wales, and Sweden, where calculations of 
reproduction rates controlled for nuptiality, both in respect of proportions
married and durations of marriage, have been made.² These calculations
show that the simple age adjusted reproduction rates exaggerated
the rise in fertility over the war period, due to the effect of
fluctuations in marriage rates. It follows that it is dangerous to draw 
the conclusion from Dr. Whelpton’s calculations that the rise in war- 
time fertility in the U.S.A. has been larger even than that shown by the 
conventional net reproduction rates, and the principal purpose of this 
note is to draw attention to this fact. 

Finally, it may be noted that the specific fertility rates for women of 
given parity and age will depend very largely on the distribution of 
these women as between the durations of time since their last confine- 
ment, so that ideally we require rates giving the probabilities of mar- 
ried women aged x and married d years of parity n bearing a child of 
order n+1, $w_n$ years after having borne a child of order n. For zero
parity women, $w_n$ becomes the duration from marriage. Although it
may in practice be impossible to calculate such detailed rates, it is 
worth realising that they are theoretical desiderata. 





2 See, for example, P. H. Karmel, “Fertility and Marriages—Australia 1933-42,” Economic Record, 
June 1944, p. 74; J. Hajnal, “The Analysis of Birth Statistics in the Light of the Recent International 
Recovery of the Birth Rate,” Population Studies, September 1947, p. 137. 
COMMENTS ON MR. KARMEL’S NOTE


P. K. WHELPTON 
Scripps Foundation for Research in Population Problems 


MR. KARMEL’S careful analysis provides the basis for two statements
which seem to me to summarize his criticisms. In the third
from last paragraph he writes “. . . Dr. Whelpton’s reproduction rates
may be even more sensitive to fluctuations in marriage rates than the
conventional reproduction rates and this seems to be a marked disadvantage.”¹
In the next to last paragraph he writes “It follows that it is
dangerous to draw the conclusion from Dr. Whelpton’s calculations 
that the rise in wartime fertility in the U.S.A. has been larger even 
than that shown by the conventional net reproduction rates, and the 
principal purpose of this note is to draw attention to this fact.” 
Although these statements may be true regarding the age-parity 
adjusted net reproduction rate in the frame of reference Mr. Karmel 
uses, I believe they are not true regarding the age-parity-fecundity- 
marriage adjusted rate.² In what follows I will try to show that (1)
the age-parity-fecundity-marriage adjusted net reproduction rate 
(assuming that 10 per cent of the women living to age 50 remain 
spinsters and 10 per cent of those who marry cannot have a child) is 
much superior to the age-parity adjusted net reproduction rate from a 
theoretical standpoint and should have been given much greater em- 
phasis than the latter in my previous article, and neither (2) the first, 
nor (3) the second of Mr. Karmel’s quoted statements is true for the 
age-parity-fecundity-marriage (81%) adjusted net reproduction rate.³





1 It is important to emphasize that Mr. Karmel uses “rates” (plural) in the first phrase, and pre- 
sumably is referring to both the age-parity, and the age-parity-fecundity-marriage, adjusted net repro- 
duction rates. He does so specifically a little earlier in the sentences “The above reasoning refers to 
age-parity adjusted net reproduction rates. But it could also be applied to age-parity-fecundity-marriage 
adjusted rates as calculated by Dr. Whelpton.” 

2 In his second paragraph Mr. Karmel states that changes in marriage rates, by themselves, do not 
affect “the underlying conditions of reproductivity.” In this frame of reference I accept his conclusion 
regarding the age-parity specific net reproduction rate. It must be remembered, however, that changes 
in marriage rates, by themselves, can affect the rate of reproduction—the rate of natural increase—of a 
nation’s population. If reproduction is being considered from this point of view the age-parity adjusted 
rate has utility. 

3 Assuming that 10 per cent of the women remain spinsters till age 50 and 10 per cent of those who
marry cannot have a child is equivalent to assuming that 81 per cent will marry and be able to bear a
child, for (100 − 10)% × (100 − 10)% = 81%.
1. The Age-Parity-Fecundity-Marriage (81%) Adjusted Net Reproduction
Rate Is Much Superior to the Age-Parity Adjusted Net Reproduction Rate


It is known that specific probabilities have important advantages 
over general probabilities for many purposes (e.g., computing a net 
reproduction rate), and that in computing probabilities events should 
be related to the group at risk. Classifying women by parity (as well as 
age) and births by birth order (as well as age of mother) not only 
increases the specificity of birth probabilities for use in computing a 
net reproduction rate but is a great improvement in relating events 
to the group at risk, for only zero-parity women can have first births, 
one-parity women second births, etc. The elimination of completely 
sterile women from the group at risk, and that of women who do not 
marry before the end of the childbearing period, also are important, 
for completely sterile women can never bear a child and the number of 
births to women who do not marry by age 50 is negligible.⁴

The proportion of the native white women of a given age who will be 
single when they reach age 50 cannot be foretold exactly, but in the 
United States probably will continue to be close to 10 per cent, the 
figure I used previously for reasons noted.⁵ The basis for estimating the
incidence of complete sterility is much less adequate than that for 
spinsterhood. In my previous article I gave reasons for assuming that 
10 per cent was a maximum estimate for the proportion of white 
women completely sterile throughout the reproductive period.⁶ Moreover,
I stated that the age-parity-fecundity-marriage adjusted net reproduction
rates based on allowances of 10 per cent for spinsterhood
and 10 per cent for complete sterility (leaving 81 per cent of the women
entering the reproductive period in the hypothetical cohort potentially
at risk)⁷ “. . . represent extreme values. The true values are between
these and the rates adjusted for age and parity, but undoubtedly are
closer to the former than the latter.” I regret exceedingly that Mr. Karmel
did not give more weight to the phrase italicized here and center his 





4 Complete sterility was defined in my previous paper as the lack of the physiological ability to 
participate in reproduction at any time (P. K. Whelpton: “Reproduction Rates Adjusted for Age,
Parity, Fecundity, and Marriage,” Journal of the American Statistical Association, December 1946, Vol.
41, pp. 507-8). The concept might be clarified by pointing out that a person who would be classified 
as completely sterile if not treated by a physician would be classified as fecund if the treatment was 
successful. 

5 Op. cit., pp. 513-4. 

6 Op. cit., pp. 512-3. 

7 To save space, this rate will be referred to hereafter as the a-p-f-m. (81%) adjusted n.r.r., the age- 
parity adjusted net reproduction rate as the a-p. adjusted n.r.r., and the conventional rate as the age 
adjusted n.r.r. 
discussion on the a-p-f-m. (81%) adjusted n.r.r., instead of saying that
his conclusions regarding the a-p. adjusted n.r.r. apply also to the
a-p-f-m. (81%) adjusted n.r.r.⁸

In the previous article I should have emphasized that there are other 
factors which affect the net reproduction rate in the same way as com- 
plete sterility. The most obvious is the strong desire of some married 
couples to be childless—a desire made effective by contraception. An- 
other is the lack of interest in married life which leads some single 
women who are fecund (i.e., able to bear children) while they are 18 to 
25 years of age but become sterile between 30 and 35, for example, to 
postpone marriage until they are 35 to 40 and cannot have children. A 
similar factor is the lack of interest in having children which leads some 
couples who are fecund during the first years of their married life to 
postpone starting a family until it is made impossible by such condi- 
tions as divorce, the death of the husband, premature menopause, etc. 
In other words, although 10 per cent may be a maximum allowance for 
the physiological conditions which make the bearing of a child impossi- 
ble for the native white American women in current cohorts, it proba- 
bly is well below a maximum allowance, and well above a minimum 
allowance, for all the physiological and socio-psychological conditions 
which are involved.⁹ In consequence, references to the a-p. adjusted
n.r.r.’s might well have been omitted from section 5 of my previous 
article, and the conclusions based on the a-p-f-m. (81%) adjusted n.r.r. 
which has allowances of 10 per cent each for spinsterhood and for what 
might be called physiological and socio-psychological sterility. 


2. The A-P-F-M. (81%) Adjusted N.R.R. Is Not More Sensitive Than 
the Age Adjusted Rate to Changes in Marriage Rates 


On the basis of data in his Tables 1 and 2 which show the effect of an 


8 Perhaps he would have concentrated on the latter if I had pointed out the following: 8 per cent is a 
minimum allowance for spinsterhood and 5 per cent for complete sterility, which leaves at risk 87.4 per 
cent of the women entering the reproductive period in the hypothetical cohort. A medium allowance is 
84.2 per cent, half way between the extremes of 81.0 and 87.4. Net reproduction rates on an 84.2 per 
cent basis could be estimated very roughly by averaging (a) the a-p. adjusted n.r.r. (which assumes 100 
per cent of the women are at risk) and (b) the a-p-f-m. (81%) adjusted rates, giving the former a weight 
of 1 and the latter a weight of 3 or 4. Such rates would be much closer to my a-p-f-m. (81%) adjusted 
n.r.r.'s than to my a-p. adjusted rates. 

9 Since the proportion that cannot bear a child because of socio-psychological conditions may be
more variable than the proportion unable because of physiological conditions, it is important that the
allowance for the former represent only the “hard core,” the minimum proportion under the conditions
that have prevailed, or are expected to prevail, during the period in question.

Some readers may object to combining such factors as the lack of interest in married life or in having 
children with the lack of the physiological ability to participate in reproduction, their reason being that 
the amount of interest in these matters may increase and lead to more childbearing. If so, they should 
remember that treatment by a competent gynecologist can overcome the physiological inability of 
couples to have children in many cases. 
increase in first births caused by an increase in marriage rates, Mr.
Karmel concludes that “the increase in the age parity adjusted rate 
would probably in fact not be insignificant compared with the increase 
in the age adjusted rate, but in many instances would be greater than 
it.” If he had compared the a-p-f-m. (81%) adjusted n.r.r. with the age 
adjusted n.r.r., I believe he would have concluded that the increase in 
first births would not affect the former more than the latter. 

As Mr. Karmel writes “The precise effect on the two rates of an in- 
crease in first births could only be worked out by using actual data and 
tracing through the effect of the increase. It would, however, be well 
worth while doing.” The actual data used in the accompanying exam- 
ples are for the years 1929, 1933, 1941, 1946, and 1947. They were 
chosen for the following reasons: 1933, because it ranks lowest of all 
years from 1920 to 1947 with respect to $F_1^0$ and the net reproduction
rate; 1946 and 1947, because they rank first and second (or third), re- 
spectively, in such an array; 1929, because it ranks toward the middle; 
and 1941, partly because of middle rank, and partly because it follows 
the only year for which we have fairly reliable estimates of the age 
specific marriage probabilities of single women.¹⁰ Because of the amount
of work involved, increases of 10 per cent will be used for $k$, as was done
by Mr. Karmel, but not increases of 5 per cent.¹¹

A rise of 10 per cent in the number of first births at each age of 
mother due to higher marriage rates in the previous year (with other 
conditions remaining unchanged) would have raised the age adjusted 
n.r.r. and the a-p-f-m. (81%) adjusted n.r.r. for 1929 by 3.1 and 3.9 
points, respectively; for 1933 by 2.8 and 3.8; for 1941 by 4.2 and 3.7; 
for 1946 by 5.2 and 3.1; and for 1947 by 6.4 and 2.8. (See Table I, 
column D.)¹² For 1929 and 1933 the increase of the a-p-f-m. (81%)





10 $F_1^0$ as used by Mr. Karmel refers to the number of first births per 100 women in a hypothetical
cohort exposed to no mortality and to the birth rates used in computing the net reproduction rate. The
$F_1^0$ values mentioned here are based on birth rates which are either age specific for all women, or age-
parity specific for the 81 per cent of the women who (it is assumed) will marry and can have a child. 

Computations were made for 1946 before data for 1947 became available. A reliable a-p-f-m. (81%) 
adjusted n.r.r. cannot be computed for years before 1920 because of the small size of the birth registra- 
tion area. 

11 $k$ is the constant fractional increase (in these examples 10 per cent) which is assumed to have
occurred in the actual number of first births to women of each age. 

It should be noted that 10 per cent is an extreme value for the United States. An examination of 
1940-41 relationships indicates that a uniform increase of approximately 18 per cent in the age specific 
marriage probabilities of single women in 1940 would have been required to raise the number of first
births in 1941 by 10 per cent, assuming no change in marital fertility or other conditions. Only once 
(i.e., from 1945 to 1946) in the 82 years for which marriage rates have been estimated does there appear 
to have been a change of this magnitude. In other words, the examples to be given represent the effect of 
extreme changes rather than those which occur frequently. 

12 It should be noted that Mr. Karmel uses 1 as the radix of the n.r.r., and that (in accordance with 
common American practice) I use 100. 
TABLE I

AGE ADJUSTED, AND AGE-PARITY-FECUNDITY-MARRIAGE (81%) ADJUSTED, NET
REPRODUCTION RATES FOR 1929, 1933, 1941, 1946, AND 1947, BASED ON ACTUAL
BIRTHS, AND ON BIRTHS OF SPECIFIED ORDER INCREASED BY 10
PER CENT AT EACH AGE OF MOTHER

                                                     With 10 Per Cent Increase at Each Age in
                              With Actual Births     First Births          Births of Each Order
                                                               Increase              Increase
Year, and Type of Rate*       $F_1^0$†    N.R.R.     N.R.R.    in N.R.R.   N.R.R.    in N.R.R.
                                                               (C − B)               (E − B)
                                 A          B          C          D          E          F

1929
  Age Adjusted N.R.R.            ..         ..         ..        3.1         ..         ..
  A-P-F-M. Adjusted N.R.R.       ..         ..         ..        3.9         ..         ..
  Difference                     ..         ..         ..         ..         ..         ..

1933
  Age Adjusted N.R.R.            ..         ..         ..        2.8         ..         ..
  A-P-F-M. Adjusted N.R.R.       ..         ..         ..        3.8         ..         ..
  Difference                     ..         ..         ..         ..         ..         ..

1941
  Age Adjusted N.R.R.            ..         ..         ..        4.2         ..         ..
  A-P-F-M. Adjusted N.R.R.       ..         ..         ..        3.7         ..         ..
  Difference                     ..         ..         ..         ..         ..         ..

1946
  Age Adjusted N.R.R.            ..         ..         ..        5.2         ..         ..
  A-P-F-M. Adjusted N.R.R.       ..         ..         ..        3.1       141.3        ..
  Difference                     ..         ..         ..         ..         8.0        ..

1947
  Age Adjusted N.R.R.            ..         ..         ..        6.4       165.1        ..
  A-P-F-M. Adjusted N.R.R.       ..         ..         ..        2.8       152.4        ..
  Difference                     ..         ..         ..         ..        12.7        ..























* In each case the a-p-f-m. adjusted n.r.r. has the allowances of 10 per cent for spinsterhood and 
10 per cent for sterility. 

† $F_1^0$ has the meaning given by Mr. Karmel, namely, the number of first births per 100 women
in a hypothetical cohort exposed to no mortality and to specific birth rates used in computing the net
reproduction rates for the year in question.
adjusted n.r.r. is larger than that of the age adjusted rate, which supports
Mr. Karmel’s criticism, but the differences amount to only 0.8
and 1.0 points, respectively. For 1941, 1946, and 1947 the reverse is
true, the increases of the a-p-f-m. (81%) adjusted rate being, respectively,
0.5, 2.1, and 3.6 less than the increase of the age adjusted rate.
The reason for these variations is the difference in the values of $F_1^0$ and
the n.r.r. for the years in question.¹³ As Mr. Karmel pointed out, $k$





13 Column A of Table I illustrates the absurdity of the $F_1^0$ values that are implicit in the methodology
of the age adjusted n.r.r., which reflects unfavorably on the accompanying $R_0$ values. According
causes an increase (a) in the age adjusted n.r.r. which varies directly
with $F_1^0$ and inversely with $R_0$, and (b) in the a-p-f-m. (81%) adjusted
n.r.r. which varies inversely with $F_1^0$. He might have added that the
increase in the latter rate which is caused by $k$ also varies inversely
with $R_0$.
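The comparisons just made can be retraced from the increases quoted in the text. The sketch below simply tabulates those quoted figures (n.r.r. on a radix of 100) and takes their differences; the variable and function names are illustrative only.

```python
# Increases in the n.r.r. from a 10 per cent rise in first births, as quoted
# in the text: (age adjusted, a-p-f-m. (81%) adjusted).
INCREASES = {
    1929: (3.1, 3.9),
    1933: (2.8, 3.8),
    1941: (4.2, 3.7),
    1946: (5.2, 3.1),
    1947: (6.4, 2.8),
}


def excess_of_apfm(increases):
    """A-p-f-m. increase minus age adjusted increase, by year, to one decimal."""
    return {year: round(apfm - age, 1) for year, (age, apfm) in increases.items()}


excess = excess_of_apfm(INCREASES)
for year, value in excess.items():
    print(year, f"{value:+.1f}")
```

A positive figure means the a-p-f-m. (81%) adjusted rate rose more than the age adjusted rate; this is so only for 1929 and 1933.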

An examination of Table I indicates that uniform percentage changes 
in the number of first births at each age of mother will cause smaller 
changes in the a-p-f-m. (81%) adjusted n.r.r. than in the age adjusted 
rate if the value for F₁⁰ with the original number of first births is ap- 
proximately 78 or more (i.e., when there would be 78 or more first 
births per 100 women living through the childbearing period) according 
to the a-p-f-m. (81%) adjusted reproduction table. When F₁⁰ (thus 
defined) is less than 78 the reverse is true. The dividing point is set at 
approximately 78 because the assumed 10 per cent increase in first 
births due to increased marriage rates raises the age adjusted n.r.r. 1.0 
and 0.8 points less than the a-p-f-m. (81%) adjusted n.r.r. for 1933 and 
1929, respectively, when F₁⁰ is 3.0 and 0.4 points below 78, but raises 
the age adjusted n.r.r. 0.5, 2.1, and 3.6 points more than the other for 
1941, 1946, and 1947 when F₁⁰ is 0.5, 2.7, and 3.0 points above 78.¹⁴ 
Since F₁⁰ is above the dividing point for 14 of the 28 years from 1920 
through 1947, the a-p-f-m. (81%) adjusted n.r.r. appears to exaggerate 
the effect of changes in marriage rates no more frequently than the age 
adjusted rate during the period for which United States data are avail- 
able. Furthermore, the maximum distortion appears to be substantially 
larger for the age adjusted n.r.r. than for the a-p-f-m. (81%) adjusted 
n.r.r. since what was by far the largest annual increase in the marriage 
rate during the period under consideration (from 103.2 in 1945 to 147.7 
in 1946) was followed a year later by an increase of 14.4 for the age 
adjusted n.r.r. compared with 10.3 for the other.¹⁵ 
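
The dividing-point rule stated above can be checked against the five illustrative years. The sketch below only re-uses figures quoted in the text; the names `CASES`, `same_sign`, and `rule_holds` are hypothetical and are not part of the original computations.

```python
# Figures quoted in the text for an assumed 10 per cent rise in first births.
# Each entry: year -> (first-birth value minus the dividing point of 78,
#                      age adjusted increase minus a-p-f-m. (81%) increase).
CASES = {
    1929: (-0.4, -0.8),
    1933: (-3.0, -1.0),
    1941: (0.5, 0.5),
    1946: (2.7, 2.1),
    1947: (3.0, 3.6),
}


def same_sign(a, b):
    """True when both numbers lie on the same side of zero."""
    return (a > 0) == (b > 0)


def rule_holds(cases):
    """Check that the age adjusted rate rises more exactly when the
    first-birth value is above the dividing point, and less when below."""
    return all(same_sign(distance, gap) for distance, gap in cases.values())


print(rule_holds(CASES))
```

The check confirms only the direction of the differences in these five years; it says nothing about why the dividing point falls near 78.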





to this methodology there would be 138 first births per 100 women in a cohort exposed to no mortality 
and the age specific first birth probabilities of 1947. 

14 The importance of using the value for F₁⁰ which agrees with the original marriage probabilities 
and first births may be illustrated with 1929. For that year F₁⁰ is 77.6 when computed from the actual 
number of first births, but is 78.5 when computed on the assumption of marriage probabilities suffi- 
ciently higher to have increased first births by 10 per cent. 

15 The discussion is limited to the period 1920 to date because a reliable a-p-f-m. (81%) adjusted 
n.r.r. cannot be computed for earlier years. The marriage rate used here (and in the remainder of this 
paper) is the number of marriages per 1,000 women, aged 17-29. It is preferred to the more commonly 
used crude rate (the number of marriages per 1,000 persons in the population) because it approaches 
more closely the ideal of relating events to the group at risk. (Over 75 per cent of all the women marrying 
in the United States marriage registration area in 1939 and 1940 were in the 17-29 age group.) A much 
better rate—the number of marriages per 1,000 single women aged 15 to 44—has been estimated by the 
National Office of Vital Statistics for each census year from 1890 to 1930, and for each subsequent year, 
but unfortunately is not available for 1921 to 1929. 

TABLE II 

MARRIAGE RATE, INDEX OF MARITAL FERTILITY, AGE ADJUSTED NET REPRODUC- 
TION RATE, AGE-PARITY-FECUNDITY-MARRIAGE (81%) ADJUSTED NET REPRO- 
DUCTION RATE, AND FIRST BIRTH COMPONENT OF THE LATTER, FOR NATIVE 
WHITE WOMEN IN THE UNITED STATES, 1920 TO 1947, AND ANNUAL CHANGES OF 
THESE RATES 

[Table body not recoverable from the source.] 

* The number of marriages per 1,000 women 17 to 29 years of age, from Whelpton, P. K., Eldridge, 
Hope Tisdale and Siegel, Jacob S.: Forecasts of the Population of the United States, 1945-1975, Bureau of 
the Census, Washington, G.P.O., 1947, p. 27. 

† The number of births during the calendar year per 1,000 ever married women aged 15 to 44, 
inclusive, at the beginning of the year. 

‡ The number of first births per 100 women living to age 50 exposed to the a-p-f-m. (81%) specific 
birth rates of the year indicated, computed arithmetically from the net reproduction tables. 

§ Between 80.95 and 81.00, the computed value being 80.96. 

3. The Age Adjusted N.R.R. Understates Slightly the Rise in Wartime 
Fertility in the United States 


In the second quoted sentence at the beginning of these comments 
Mr. Karmel questioned the accuracy of my earlier statement “Repro- 
duction rates adjusted for age and parity, or for age, parity, sterility, 
and spinsterhood, show the wartime rise in fertility to have been some- 
what larger than has been thought on the basis of the conventional 
reproduction rates. From 1940 to 1943 the age adjusted net reproduc- 
tion rate increased 21.0 per cent (from 100 to 121), the age-parity 
adjusted rate increased 27.6 per cent (from 98 to 125), and the age- 
parity-fecundity-marriage adjusted rate increased 22.7 per cent (from 
97 to 119).”¹⁶ Although he may be correct with reference to the age- 
parity adjusted rate, I shall try to show that he is incorrect with regard 
to the a-p-f-m. (81%) adjusted n.r.r. 
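
The three percentage increases in the quoted statement follow directly from the 1940 and 1943 rates. A minimal sketch of the arithmetic, in which the function name `pct_change` and the dictionary labels are hypothetical:

```python
def pct_change(start, end):
    """Percentage change from start to end, rounded to one decimal place."""
    return round(100 * (end - start) / start, 1)


# (1940 rate, 1943 rate) as quoted in the text.
RATES_1940_1943 = {
    "age adjusted": (100, 121),
    "age-parity adjusted": (98, 125),
    "a-p-f-m. adjusted": (97, 119),
}

for label, (r1940, r1943) in RATES_1940_1943.items():
    print(label, pct_change(r1940, r1943))
```

Because the three rates start from slightly different 1940 levels, the ranking by percentage increase need not match the ranking by increase in points.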

Whether the rise of the marriage rate from 90.6 in 1939 to 114.5 in 
1942 would exaggerate the 1940—43 rise of the age adjusted n.r.r. more 
than that of the a-p-f-m. (81%) adjusted n.r.r. depends on the values 
for F₁⁰, as was brought out in section 2. That for 1940 is 77.4, which is 
below the dividing point of 78 described in section 2. The 1941 value is 
78.5, but it might have been below 78 if the marriage rate had not 
risen. For 1942 and 1943 F₁⁰ is 79.9 and 79.8—well above 78. (See 
Table II, column E.) It appears, therefore, that (a) the 1939-40 in- 
crease of the marriage rate should exaggerate the 1940-41 increase of 
the age adjusted n.r.r. less than that of the a-p-f-m. (81%) adjusted 
n.r.r., (b) the 1940-41 increase of the marriage rate should affect the 
two reproduction rates about equally, and (c) the 1941-42 increase 
of the marriage rate should exaggerate the 1942-43 increase of the 
age adjusted n.r.r. more than that of the other. Actually, the 1940-41 
increase of the age adjusted n.r.r. is 5.3, that of the other is 5.6, and the 
difference between them is in the expected direction. (See Table II, 
columns H and I.) In contrast, the 1941-42 increases are 11.1 and 
9.8, the 1942-43 increases are 4.7 and 7.4, and the differences are not of 
the type expected. 

To clarify the situation it is necessary to consider how the age ad- 
justed n.r.r. and the a-p-f-m. (81%) adjusted n.r.r. are affected by 
changes in marital fertility. Because Mr. Karmel says nothing about the 
matter, readers of his notes may assume that the age adjusted n.r.r. 
and the a-p-f-m. (81%) adjusted n.r.r. are affected identically by such 
changes. This is not correct for first births, as was proved in section 2. 
Although it was assumed there that an increase of 10 per cent in first 





16 Op. cit., pp. 514-5. 

births occurred because of higher marriage rates (with no change in 
marital fertility at zero parity), the changes in the two net reproduction 
rates would have differed by the same amounts if the assumed cause of 
the additional first births had been higher fertility rates of zero-parity 
married women (with no change in marriage rates). 

But how would the two reproduction rates behave if the number of 
births to married women of all parities were increased by 10 per cent? 
Again the answer at present must be based on examples rather than 
formulas, and again the years 1929, 1933, 1941, 1946, and 1947 will be 
used as illustrations. These increases of 10 per cent in births would 
have raised the age adjusted n.r.r. and the a-p-f-m. (81%) adjusted 
n.r.r. for 1929 by 10.5 and 13.5 points, respectively; for 1933 by 9.3 and 
11.5; for 1941 by 10.6 and 12.7; for 1946 by 13.6 and 14.0; and for 1947 
by 15.0 and 14.8.¹⁷ (See Table I, column F.) Three of these five increases 
of the age adjusted n.r.r. are smaller by at least 2.0 than the corre- 
sponding increases of the a-p-f-m. (81%) adjusted n.r.r., one is smaller 
by 0.4, and one is larger, but by only 0.2. It is clear, therefore, that the 
age adjusted n.r.r. is less sensitive than the a-p-f-m. (81%) adjusted 
rate to uniform percentage changes in the number of births of each 
order to married women, except when first birth rates are unusually 
high, as they were in 1947.¹⁸ 
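
The comparison of the five pairs of increases can be tabulated as follows. This is only a sketch of the arithmetic behind the sentence above; the names `INCREASES` and `gaps` are hypothetical.

```python
# Increase in points from a uniform 10 per cent rise in births of every order:
# year -> (age adjusted n.r.r., a-p-f-m. (81%) adjusted n.r.r.), as quoted.
INCREASES = {
    1929: (10.5, 13.5),
    1933: (9.3, 11.5),
    1941: (10.6, 12.7),
    1946: (13.6, 14.0),
    1947: (15.0, 14.8),
}


def gaps(increases):
    """Points by which the a-p-f-m. increase exceeds the age adjusted one."""
    return {year: round(apfm - age, 1) for year, (age, apfm) in increases.items()}


for year, gap in gaps(INCREASES).items():
    print(year, gap)
```

A negative gap marks a year, here 1947 only, in which the age adjusted rate responds more strongly than the other.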

Actual situations are not as simple as those just posited, for all age 
specific or age-parity specific birth rates do not change at a uniform 
rate from one year to another while marriage rates remain constant. 
Thus, from 1945 to 1946 the birth probabilities of zero-parity women 
increased substantially at most ages (by over 30 per cent at most ages 
from 19 to 39) and those of one-parity women, two-parity women, etc., 
increased successively in lesser degree and at fewer ages (though even 
for five-parity women, increases outnumbered decreases). The preced- 
ing discussion would lead one to think that under these circumstances 
the rise from 1945 to 1946 would be smaller for the age adjusted n.r.r. 
than for the a-p-f-m. (81%) adjusted n.r.r. Moreover, marriage rates 
were substantially higher in 1945 than in 1944, which (according to Mr. 
Karmel) would be expected to cause a similar differential. In fact, how- 
ever, the age adjusted n.r.r. rose 24.6 and the a-p-f-m. (81%) adjusted 
rate 19.9, or 4.7 less. That this reversal of the expected relationship is 





17 A 10 per cent increase in marital fertility rates at each age and parity would have been theoreti- 
cally possible in these years because a much larger increase would have been required to raise any of 
these rates to unity. 

18 In such a year the age adjusted n.r.r. exaggerates the change. As shown in Table I, column A, 
the methodology of this rate ascribes 138 first births to each 100 women living to age 50 under conditions 
of 1947, which obviously is impossible. Increasing the actual number of births of each order by 10 per 
cent would call for 152 first births per 100 women, which is even more impossible. 

due to the peculiar pattern of the changes in the birth probabilities is 
indicated by the fact that applying the 1945-46 rates of change of the 
various age-parity specific birth probabilities to the appropriate 1941 
probabilities increases the 1941 age adjusted n.r.r. by 23.9 and the 
a-p-f-m. (81%) adjusted rate by 20.8, a difference of 3.1 in the same 
direction. 

These and other comparisons provide the basis for concluding: The 
larger the percentage increase in the birth probabilities for lower pari- 
ties relative to those for higher parities, the larger the increase in the 
age adjusted n.r.r. relative to the a-p-f-m. (81%) adjusted n.r.r. The 
converse is equally true; moreover, similar statements can be made 
regarding decreases. 

Showing that the two n.r.r.’s under consideration are not affected 
identically by given changes in marital fertility does not prove that one 
of the rates measures the effect of such changes more accurately than 
the other. The writer maintains, however, that the methodological su- 
periority of the a-p-f-m. (81%) adjusted n.r.r. (discussed in section 1) 
justifies the tentative conclusion that this rate does reflect changes in 
marital fertility more accurately than the age adjusted n.r.r. 

The discussion of the 1940-43 changes may now be continued. From 
1940 to 1941 the number of births per 1,000 ever married women aged 
15 to 44¹⁹ rose 5.9 points (from 123.7 to 129.6), due mostly to an in- 
crease in first births. (See Table II, column B.) From 1941 to 1942 the 
index of marital fertility jumped 13.2 points, again due mostly to an 
increase in first births. From 1942 to 1943, in contrast, the index rose 
only 3.3 points, the first birth rate declined, and the rates for fifth, 
sixth, seventh, and eighth (and higher) births rose for the first time 
since 1921. These changes would be expected to exaggerate the increase 
of the age adjusted n.r.r. from 1940 to 1941 and from 1941 to 1942, but 
to have the opposite influence from 1942 to 1943. As stated at the be- 
ginning of this section, changes in the marriage rate tend to exaggerate 
the increase of the a-p-f-m. (81%) adjusted n.r.r. from 1940 to 1941 
and of the age adjusted n.r.r. from 1942 to 1943. The net effect of these 
opposing influences seems to be slightly in favor of the a-p-f-m. (81%) 
rate. In other words, the change in the fertility of all native white 
women from 1940 to 1943 should be shown slightly more accurately by 
the 22.7 per cent rise in the a-p-f-m. (81%) adjusted n.r.r., than by the 
21.0 per cent rise in the age adjusted n.r.r. 

Other comparisons can be made by using the data in Table II. Thus, 
1925-29 is a very good example of a period when the changes of fertility 





19 This ratio will be used hereafter as an index of marital fertility. 

rates were relatively large and those of marriage rates (one year earlier) 
were relatively small, since the marriage rate of all women declined 
slowly from 91.5 in 1924 to 86.5 in 1928, and the index of marital 
fertility dropped from 161.7 in 1925 to 136.8 in 1929. (See Table II, 
columns A, B, F, and G.) The larger declines of the a-p-f-m. (81%) 
adjusted rate from year to year during this period should describe what 
was happening more accurately than the smaller declines of the age 
adjusted rate. (See columns H and I.) The differences in the annual 
declines are small, however, being 0.3 for 1926-27 and 1.1 for the other 
three pairs of years. 

Illustrations of the opposite situation—the index of marital fertility 
relatively constant, the marriage rate fluctuating enough to affect the 
first birth rate significantly, and the age adjusted n.r.r. measuring 
changes in fertility more accurately than the other—are more difficult 
to find, for it is necessary also that the value for F₁⁰ be relatively low. 
The best example seems to be 1933-38, for the marriage rate one year 
earlier rose over 40 per cent (from 68.8 in 1932 to 96.4 in 1937), the 
index of marital fertility fluctuated only slightly (between 119.5 and 
124.1), F₁⁰ did not exceed 77.1, and the increase in the birth probabili- 
ties at lower parities (for most ages) and decreases at higher parities 
(at most ages)—which were nearly compensating as far as the index 
of marital fertility is concerned—would tend to raise the age adjusted 
n.r.r. more than the other. Under these conditions, the gain of 4.9 in 
the age adjusted n.r.r. from 1933 to 1938 should show what happened 
more accurately than the gain of 9.3 in the a-p-f-m. (81%) adjusted 
n.r.r. During this period, also, the absolute differences in the annual 
change of the two rates were small, the largest being 1.7 points for 
1933-34, following a 7.4 point rise in the marriage rate from 1932 to 
1933. 

Had my previous article been written three years later it would have 
contained a statement like the following: Reproduction rates adjusted 
for age, parity, sterility, and spinsterhood show the postwar rise in 
fertility to have been substantially smaller than is indicated by the con- 
ventional rates (adjusted for age only). From 1945 to 1947 the age 
adjusted n.r.r. jumped 39.0 points (from 111.1 to 150.1) and the 
a-p-f-m. (81%) adjusted rate 30.2 points (from 107.4 to 137.6). Since 
the marriage rate skyrocketed from 94.1 in 1944 to 147.7 in 1946, Mr. 
Karmel’s argument would lead one to expect the a-p-f-m. (81%) ad- 
justed rate to have the larger rise; actually its rise was smaller by about 
23 per cent. It is under such unusual conditions that the methodology 
of the a-p-f-m. (81%) adjusted rate is of greatest value. 
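
The postwar figures can be reconciled with the annual changes cited earlier in this paper. The sketch below assumes nothing beyond the quoted rates; `point_change` is a hypothetical helper name.

```python
def point_change(start, end):
    """Change in points between two rates, rounded to one decimal place."""
    return round(end - start, 1)


age_rise = point_change(111.1, 150.1)    # age adjusted n.r.r., 1945 to 1947
apfm_rise = point_change(107.4, 137.6)   # a-p-f-m. (81%) adjusted rate

# Relative shortfall of the a-p-f-m. rise, in per cent of the age adjusted rise.
shortfall_pct = round(100 * (age_rise - apfm_rise) / age_rise)

# The two-year rises should equal the sums of the annual rises quoted in the
# text for 1945-46 and 1946-47.
age_annual = round(24.6 + 14.4, 1)
apfm_annual = round(19.9 + 10.3, 1)

print(age_rise, apfm_rise, shortfall_pct)
```

The "about 23 per cent" in the text is therefore the 8.8 point gap expressed relative to the 39.0 point rise of the age adjusted rate.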





NOTE ON “AN ATTEMPT TO GET THE NOT-AT-HOMES 
INTO THE SAMPLE WITHOUT CALLBACKS” 


ALFRED POLITZ 
AND 
WILLARD SIMMONS 


This note provides additional information concerning the 
plan for eliminating the need for callbacks described in the 
March, 1949 issue of the Journal. Attention is called to the 
difference in the effects of clustering in using the “nights-at- 
home” plan or callbacks, and an error in the calculation of the 
sampling error of the former is corrected. 


COMMENTS RECEIVED about the above paper appearing in the 
March, 1949 issue indicate that a few points, fundamental to the 
operation of the plan, are subject to possible misinterpretation. 

First, we wish to clarify the difference between asking whether 
respondents were at home at the same time on five preceding evenings, 
as outlined in Part I, and asking about moments selected at random, 
which forms the basis of the theoretical treatment in Part II. 
The assumption involved is stated in paragraph 2, page 17 and its 
consequences are mentioned in footnote 9, page 21. The theory de- 
veloped in Part II shows neither the bias nor the sampling error for 
the plan described in Part I, except insofar as this assumption is valid. 
The development of the theory without such an assumption is the 
subject for an additional paper. It can be shown that the bias con- 
tribution for the plan described in Part I is equivalent to that of a sur- 
vey in which callbacks are made on five consecutive evenings at the 
same time. That is to say, it is unbiased for all persons who were at 
home on any of the six evenings at the specified time. 
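
For readers unfamiliar with the plan, the weighting it relies on may be sketched as follows. The sketch assumes the weighting usually associated with the nights-at-home plan, which this note does not restate: a respondent found at home who reports being at home on k of the five preceding evenings is weighted by 6/(k+1). The function names and the sample data are hypothetical.

```python
def nights_at_home_weight(previous_nights_home):
    """Weight for one interview: the reciprocal of the fraction of the six
    evenings (five preceding ones plus the interview evening) spent at home.
    This weighting is an assumption of the sketch, not quoted from the note."""
    if not 0 <= previous_nights_home <= 5:
        raise ValueError("expected a count between 0 and 5")
    return 6 / (previous_nights_home + 1)


def weighted_proportion(interviews):
    """Weighted share of respondents having some attribute.

    interviews: iterable of (has_attribute, previous_nights_home) pairs.
    """
    total = 0.0
    hits = 0.0
    for has_attribute, previous_nights_home in interviews:
        weight = nights_at_home_weight(previous_nights_home)
        total += weight
        if has_attribute:
            hits += weight
    return hits / total


# Hypothetical interviews: two stay-at-homes with the attribute, two
# respondents who are seldom at home without it.
SAMPLE = [(True, 5), (True, 5), (False, 2), (False, 0)]
print(weighted_proportion(SAMPLE))
```

In this made-up sample the unweighted share is one half, while the weighted share is one fifth, because the seldom-at-home respondents stand in for others like them who were missed.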

From a practical viewpoint, it is impossible to obtain information 
from each respondent for six moments chosen independently at random 
within the survey period as assumed in Part II. Such a plan would re- 
quire that each respondent be visited at the latest of the six moments 
chosen in order for him to state whether or not he was at home at the 
other five moments. But this restriction contradicts the initial hy- 
pothesis that each visit will be made at a randomly determined time. 

Another alternative is to ask about any group of six consecutive 
nights selected at random, but for each night to specify a particular 
moment chosen at random instead of the same moment for all six 
nights. For example, the interviewer might ask: “Did you happen to 
be at home last night at 8:15?”, “How about night before last at 
7:25?” and so on. These questions can be shown to produce an unbiased 
estimate for all persons at home on any of the six nights at any time 
during interviewing hours. Even here, however, for those persons inter- 
viewed on the first five nights of the interviewing period, some of the 
nights asked about will be prior to the beginning of the interviewing 
period. This involves an assumption regarding changes in the universe 
over time. Assumptions with regard to time, of course, are common to 
all surveys that are not conducted instantaneously. In this case the 
assumption is that the five days preceding the survey period are equiva- 
lent to the last five days of the period with respect to the frequency 
with which persons are away from home. 

In the last two pages of this paper a hypothetical example is given 
comparing the “nights-at-home” plan, Plan A, with a callback opera- 
tion, Plan B. Attention is called to the difference in the effects of clus- 
tering as between Plan A and Plan B. By the hypothesis of this example, 
the initial sampling operation for each plan is one-half as efficient as in 
unrestricted random sampling. This requires more clusters under 
Plan A than under Plan B because of the larger number of initial calls 
permitted in lieu of callbacks. Plan A should not require, however, more 
interviewer visits to clusters than would be incurred in a callback 
operation. For this example, therefore, the nights-at-home plan pro- 
duces a smaller sampling error at a lower field cost. The conditions for 
this example, however, might have been interpreted to mean that equal 
numbers of clusters would be included in the two plans. This assump- 
tion leads to a larger sampling error for the nights-at-home plan 
(1.03%) than for the callback operation (1.006%). 

The sampling error of .85 per cent shown for Plan A on 
page 31 is in error, the correct figure being .91 per cent. This error 
results from the incorrect assumption that ps is equal to unity. The 
correct value of pz# is .612 as determined by evaluating ct#/cton 
where off is given by 

[Equation not recoverable from the source.] 

where δⱼ equals unity if the jth person is at home and zero otherwise. 
This equation corresponds to the expression for o%? shown in Equation 
21, page 24. 

The authors will appreciate any additional comments from persons 
who are experimenting with this plan or a similar plan. 





ALFRED JAMES LOTKA, 1880-1949 


A brilliant chapter in modern demography came to a close with the 
death of Alfred James Lotka on December 5, 1949. He would have 
reached his 70th birthday next March 2nd. The foundation for his 
life work was laid in his undergraduate studies at Birmingham Uni- 
versity, England, completed in 1901, and at the University of Leipzig, 
Germany, the following year. It was at Leipzig that he developed his 
concepts on the mathematical theory of evolution which were first 
published in 1907. The same year also saw his first paper on population 
analysis, “Relation Between Birth Rates and Death Rates,” in which 
he showed that a population increasing at a prescribed rate and subject 
to a prescribed life table assumes a definite age distribution with a 
constant birth rate and death rate. A year at Cornell, in 1908-09, gave 
him an opportunity to concentrate on his interests and led to his fun- 
damental paper in 1911, jointly with F. R. Sharpe, on “A Problem in 
Age-Distribution.” The problem was stated as follows: 


“Given the age distribution in an isolated population at any 
instant of time, the ‘life curve’ (life table), the rate of procreation 
at every age of life, and the ratio of male to female births, to 
find the age distribution at any subsequent instant.” 


In this paper, the mathematical expression for the “true” rate of nat- 
ural increase was first derived. Its solution, with a practical applica- 
tion, came years later. 

During his early years in this country, Dr. Lotka was employed at 
various times by the General Chemical Company, the United States 
Patent Office, the Bureau of Standards, and as an editor with the 
Scientific American. Meanwhile, he continued his investigations and 
came to a point in 1922, when he considered it desirable to spend two 
years at Johns Hopkins to study further and to summarize his thoughts. 
Out of this came his book, “Elements of Physical Biology,” published 
in 1925, where he systematized and developed methods for the applica- 
tion of the principles of physics to the study of biological systems. The 
book won international attention. 

Dr. Lotka came to the Statistical Bureau of the Metropolitan Life 
Insurance Company as supervisor of mathematical research in 1924, 
and the next year this JOURNAL carried the joint paper with myself, 
“On the True Rate of Natural Increase of a Population.” This paper 
contained the algebraic solution of the mathematical expression for the 
“true” rate of natural increase which had been derived earlier; it also 
contained an application of the solution to observed data. The paper 
had wide repercussions and reoriented the thinking of students of 
population problems. The Malthusian fears of overpopulation gave 
way to alarm that the Western populations were headed for great 
declines in numbers. Later papers by Dr. Lotka amplified various 
phases of the basic problem quoted previously. His work inspired many 
others throughout the world in their study of population problems by 
quantitative methods. A particular phase of his investigations, on the 
theory of self-renewing aggregates, originated from his population 
problem, but was soon found to be applicable also to the problem of 
industrial replacement and, perhaps more important, to nuclear fission. 

His service with the Metropolitan Life Insurance Company provided 
not only added opportunities and facilities, but also opened new fields 
for his talents—economics and public health. Studies in this direction 
led to three books in which we were joint authors, “The Money Value 
of a Man” (1930, revised 1946), “Length of Life” (1936), and “Twenty- 
Five Years of Health Progress” (1937). A revised edition of “Length 
of Life,” in which Mortimer Spiegelman joined as co-author, was pub- 
lished August 1949. During this period he also wrote “Théorie ana- 
lytique des associations biologiques,” published in Paris in two parts, 
1934 and 1939, which summarized the essentials of his work on the 
mathematical theory of evolution and on the mathematics of popula- 
tion analysis. He retired from the Metropolitan in 1947 with the rank 
of Assistant Statistician and had since then been engaged in a trans- 
lation and revision of this important French publication. 

Dr. Lotka’s achievements brought him great renown and high pro- 
fessional honors. He was President of the American Statistical Associ- 
ation in 1942 and of the Population Association of America in 1938-39. 
Recently he had been active as American Vice-President of the 
International Union for the Study of Population. He was also a mem- 
ber of the International Statistical Institute, a Fellow of the Institute 
of Mathematical Statistics, and a member of the Swiss Actuarial 
Society. 

His many contributions were the product of a keen and creative 
mind. They have given him an honored place in the history of statis- 
tics. 

LOUIS I. DUBLIN 





BOOK REVIEWS 


(Pending the establishment of a new editorial board, the review section has 
been edited for the Secretary’s Office since August 1949 by Dr. Ernest Rubin.) 


Business Cycles and Forecasting. Third Edition. Elmer Clark Bratt (Professor 
of Economics, Lehigh University, Bethlehem, Pa.). Chicago, Ill., Richard D. 
Irwin, Inc., 3201 S. Michigan Ave. 1948. Pp. vi, 585. $5.00. 


REVIEW BY JOSEPH A. SCHUMPETER 


Department of Economics, Harvard University 
Cambridge, Massachusetts 


THE well-deserved success of the previous editions of this textbook is likely 
to be more than matched by the present revision. The author has suc- 
ceeded in incorporating new facts, new approaches, and new problems with- 
out overloading the ship. From the undergraduate’s standpoint, the attempt 
“to provide the background for explaining economic change” (p. v) qualifies 
the book not only to serve the purpose indicated by its title but also to sup- 
plement the command of economic facts and methods he may be supposed 
to have previously acquired in his general courses. In graduate work, also, 
it may serve as a useful introduction to, or preliminary survey of, the sub- 
ject of business cycles. The author’s mature and balanced judgment, par- 
ticularly in evidence where most needed, namely in the chapter on Economic 
Planning and Full Employment (XXII: see especially his statement of what 
he calls the continuous-policy approach, p. 534), shows up to advantage 
throughout. 

100 pages out of 550 are devoted to a history of business cycles from 1784- 
1947 that is fragmentary to 1929 but rather full for the last twenty years. 
Teachers as well as students should be grateful for this method, the only one 
available, of imparting to the student something like experience in these 
matters which is so important in order to appreciate the difficulties that 
theoretical and statistical methods of cycle analysis have to face. Therefore 
it is perhaps less than fair to point out that the author’s report does not 
quite achieve this purpose. This is not only because of the lacunae—it is 
e.g. hardly possible to understand what happened from 1832 to 1843 without 
knowing anything about the course of events from 1822 to 1831—but also 
because the author refers hardly at all to developments in individual indus- 
tries though elsewhere he amply proves that he is not blind to their relevance 
for the cyclical process: primarily, the report (with occasional references to 
railroad building) runs in terms of financial and political determinants of the 
“general business situation,” a practice which is often defended on the 
untenable ground that, business cycles being a phenomenon that pervades 
the whole economy, explanatory factors cannot be looked for in the growth 
and decay of individual industries. Chapters XV-XXII on barometers, 
forecasting, and policy which follow upon the presentation of historical facts 
are all that can be expected from a textbook and duly emphasize novel pos- 
sibilities and ideas. 




Seasonal variations and the problem of their elimination having been 
dealt with in Chapter II, the reader is then introduced to the phenomenon of 
long-period industrial growth that is represented by the trend of total phys- 
ical output. The case for choosing this criterion is set forth with care but 
the pitfalls that beset trend analysis are not brought home to the reader quite 
as effectively as might be desirable. This fitting of trends to time series is a 
dangerous procedure that makes sense only under very restrictive assump- 
tions, both of a statistical and of a theoretical nature. So far as the general 
impression upon the reader’s mind is concerned, the author’s methods are 
suggestive of the old Harvard method. It is true that he also refers to the 
method of the National Bureau of Economic Research and in addition to 
that of Ragnar Frisch as simplified by L. A. Maverick and to various smooth- 
ing devices. Perhaps the students for whom the book is intended cannot be 
trusted to absorb much more than this. But I do hope that in future editions 
which I feel confident will be called for, the author will insert, at least by 
way of warnings, something about autoregressive stochastic setups and the 
tricks that they are likely to play on us. 

From inspection of contour lines, after having taken care of seasonal 
variation and the growth trend, the author infers the existence of “secondary 
trends.” He recognizes that this “secondary trend is as truly a cycle... 
as is the business cycle...” (p. 70) and that it is of “key importance,” also 
for forecasting (p. 77). Nevertheless he refuses to call it a cycle on the ground 
that it is “a long drift rather than a movement showing rapid reversals.” 
Such conceptual arrangements are largely matters of taste and expository 
convenience and there would be no point in debating them. I submit, how- 
ever, that the conceptual arrangement preferred by the author obscures the 
important similarity of mechanism that exists between his cycles and his 
secondary trends. This similarity stands out clearly if we observe the rever- 
sals that characterize the prosperities which ended in major crises and seem 
to be parts of longer swings as is suggested, e.g., by the English series of 
percentage unemployment of trade-union members. The author defines busi- 
ness cycles as an oscillation that averages about 3½ years in length (p. 150) 
—though no periodicity in the technical sense is asserted—a statement that 
does not agree very well with the evidence presented on pp. 241 et seq. 
Questions of amplitude, length of phases, and agreement or discrepancy 
between cyclical movements in different countries (chapter X) are dealt with, 
on the basis of standard indices, in a thoroughly commonsense fashion. 

The author’s own analytic apparatus (“theory”) as well as his attitude to 
other theories is catholic. He distinguishes commendably between “originat- 
ing causes” and “conditioning causes of the self-generating oscillation”— 
which distinction is roughly equivalent with that between impulses and 
mechanisms of propagation—and also between exogenous and endogenous 
theories. The usefulness of these distinctions, which I believe are now 
accepted universally or almost so, is somewhat impaired by the fact that they 
interfere with one another: impulses are not independent of the situations 
created by mechanisms of propagation and vice versa. Also, there are slight 
hazinesses of definition* which the author eliminates by enumerating what 
he is going to consider as originating causes and exogenous factors (pp. 96 
and 161). Both lists might be coordinated more closely but for the rest few 
people will disagree with them or with the comprehensive statement of his 
own position (pp. 154-6). Accordingly, in chapters VII-IX, what are usually 
referred to as “business-cycle theories” are introduced as different emphases 
on, or contributions to, different aspects of the phenomenon rather than as 
competitive explanations that are hopelessly at variance with one another— 
a most encouraging sign of the times. The selection is rich and felicitous, 
criticism reserved and well balanced. Sketchiness in spots—the “econometric 
approach” gets little over three pages—is justified by the purpose and char- 
acter of the book. Many seasoned fellow economists whose main interests 
lie elsewhere could do worse than studying it in spite of the protests that 
any specialist—historical, theoretical, and statistical—might feel inclined to 
raise. 


Probability and Induction. William Kneale (Fellow of Exeter College, Lecturer 
in Philosophy in the University of Oxford, Oxford, England). Oxford University 
Press, Amen House, London E.C. 4. 1949. Pp. viii, 264. $3.75. 


REVIEW BY JOHN H. SMITH


Department of Statistics, American University 
Washington 6, D. C. 


THIS book is based on lectures to students of philosophy. It is intended for
readers interested in philosophical problems suggested by the title. Although
the author states specifically that the book is not a treatise on the
mathematical theory of probability nor a practical guide to the scientific 
method, the parts of the book which deal with these topics are precisely those 
which would be of greatest interest to readers of this Journal. 

A considerable amount of effort is devoted to the problems of definition 
in the theory of probability. In this connection, the author brings in most of 
the unnecessary difficulties connected with the task of incorporating in the 
same statement the definition of a theoretical concept and the practical 
conditions under which it would be appropriate to apply it. For example, on 
page 158, he says that the theoretical meanings of point, line, and plane in
geometry are determined by the postulates and it is no objection to the 
system of geometry that it cannot be applied to experience. He goes on to 
contrast the situation with respect to probability in which the philosopher 





* For instance, most writers, especially Tinbergen, mean by an endogenous theory a model that 
accounts for oscillations solely by the relations assumed to exist between the cyclical variables, so that 
the development of the system is determined by its initial conditions. This will exclude—and make 
exogenous—some purely economic factors that arise from within the business sphere. It might be more
useful to identify exogenous with external factors, i.e. with factors that act upon the economic process
from without the business sphere. 
is not “to construct a formal system with consistency and elegance for his
only guidance. His task is to clarify the meaning of probability statements 
made by plain men.” In all this he seems to assume that the attitude common 
in connection with applications of pure mathematics would not be appropri- 
ate in connection with applications of probability. The reviewer takes the 
opposite position. It is possible to define probability as the proportion in the 
appropriate part of a universe or sampling distribution. Most other problems 
of definition of probability concepts arise in connection with outlining condi- 
tions under which applications would be appropriate. These questions must, 
in the final analysis, be settled according to the judgment of the research
worker, which good discussions of probability should help to develop. 

The suggested approach implies, for example, that the probability that a
toss of two dice will result in eight spots on the upper faces is the proportion 
in the sampling distribution of sums of samples of two selected with replace- 
ment from a universe composed of the integers 1-6 inclusive. Of course, one 
should not make this application in the case of loaded dice for which it 
would be necessary to make some obvious modifications. Nevertheless, the 
simplest form of the theory provides the basis for all the laws of probability 
which should, of course, be extended to the case of alternatives with unequal 
probabilities and to irrational numbers as well as to the continuous case. In 
the statement of such a theory, one should have the same attitude as he 
would have in the exposition of plane geometry; he should take for granted 
that anyone applying the theory should be responsible for deciding whether 
or not conditions are appropriate for its application. 
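
The definition suggested above can be checked by direct enumeration. The following is a minimal sketch in Python, not anything taken from the book under review; the function name `prob_of_sum` and the particular loaded-die weights are assumptions introduced only for illustration. It lists every ordered sample of two drawn with replacement from the integers 1-6 and takes the proportion whose sum is eight, then repeats the calculation with unequal face probabilities.

```python
from fractions import Fraction
from itertools import product


def prob_of_sum(target, faces=range(1, 7), weights=None):
    """Probability that two independent draws, with replacement, sum to `target`.

    With weights=None every face is equally likely, so the result is simply
    the proportion of ordered samples of two whose sum equals `target`.
    """
    faces = list(faces)
    if weights is None:
        weights = [Fraction(1, len(faces))] * len(faces)
    p = dict(zip(faces, weights))
    return sum(p[a] * p[b] for a, b in product(faces, repeat=2) if a + b == target)


# Fair dice: 5 of the 36 ordered samples (2+6, 3+5, 4+4, 5+3, 6+2) sum to eight.
print(prob_of_sum(8))  # 5/36

# Hypothetical loaded dice (an assumed example): six has probability 1/4,
# every other face 3/20. The same rule applies with unequal probabilities.
loaded = [Fraction(3, 20)] * 5 + [Fraction(1, 4)]
print(prob_of_sum(8, weights=loaded))  # 57/400
```

Whether the equal-weight or the loaded version is the right one for a given pair of dice remains, as argued above, a matter for the judgment of the person applying the theory.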

In his further discussion of probability, the author considers various concepts
which are interesting from the philosophical point of view, such as
Bernoulli’s theorem, Poincaré’s equalization theorems, the principle of indif- 
ference or equal distribution of ignorance, inverse probability, Bertrand’s 
paradox and so forth. Considerable attention is devoted to ideas presented 
by Keynes in Treatise on Probability. 

The concluding chapters deal with problems of induction classified as to 
primary and secondary and as to natural laws and probability rules. In
connection with primary induction, the author states that “the most we can 
say is that we have looked for things which are both α and β but have not
found them.” Later he speaks of searching continually for counter-evidence.
The term secondary induction is used in dealing with “theories as opposed 
to laws or probability rules.” Just as in the case of primary induction of 
natural laws theories are expected to explain, secondary induction is proposed 
as a policy—a device which meets with the reviewer’s approval. 

The interesting discussion of the scientific method is made a little awkward
by repeated emphasis on the sense in which one should not refer to induced 
principles as probable. This is not surprising in view of his statement (p. 222) 
that “the result of the statisticians’ work on a sample is not an inference to 
something beyond the sample, but a lucid presentation of the known facts 
about the sample.” 
Quantitative Methods in Psychology. Don Lewis (Professor of Psychology, State
University of Iowa). Iowa City, Iowa, The Book Shop, 114 East Washington St. 
1949. Pp. 286. Paper. 


REVIEW BY PHILIP DESIND
Statistician, Navy Department, Washington 25, D.C. 


THE author has prefaced his book, “Quantitative Methods in Psychology”
with the observation that his students “had not yet learned to think in 
quantitative terms, to discover basic relationships between variables... ”; 
hence, his aim “to teach the fundamentals of quantitative (scientific) analy- 
sis” in the application of mathematics to experimental psychology. The 
author’s aim is laudable but his weapon hardly suits the target. The inability 
to “think in quantitative terms” is a characteristic that is not restricted to 
non-mathematical students and the problem of overcoming it is invariably 
tabled in favor of the less satisfactory solution of substituting knowledge for 
mature comprehension. In this connection, this reviewer recalls an article 
by B. D. Wood and F. S. Beers that appeared in the “Teachers College
Record” in March 1936 under the title, “Knowledge versus Thinking” where 
stress was laid on the emptiness of the generalization of “teaching to think.” 
Overlooking the author’s claims for the preparation of this book, however, 
and examining the text itself, one finds a general treatment of a great variety 
of basic mathematical concepts. There is certainly a great need for such a text 
in a statistics curriculum for students whose knowledge of mathematics has 
not kept pace with their studies in statistics. 

The following chapter headings serve to show the coverage: 

Variables, Constants and Functional Relationships; Fitting Curves to 
Empirical Data; Logarithms; Differentiation; Integration; The Normal 
Curve; Distribution Functions; Applications of Equations; and Goodness of 
Fit. 

Probably, because of the extent of coverage as shown above, many topics 
are treated rather sketchily and in some instances mechanically, i.e., the 
chapters on “Differentiation” and “Integration.” One gross omission is the 
author’s failure to include a chapter on basic probability concepts which are 
essential for establishing a firm basis in statistics. However, some probability 
concepts are introduced, though somewhat abruptly, in the chapters on the
“Normal Curve” and “Distribution Functions.” The author makes use of 
“degrees of freedom” but makes no attempt to explain the concept. The 
chapter on “Goodness of Fit” may be somewhat confusing to students pri- 
marily because of the abundance of categorical statements (which are not 
always correct) without theoretical justification. 

This text comes in for its share of misleading statements emanating pri- 
marily from the loose presentations which usually occur in attempting to 
simplify statistical theory for students with limited mathematical training. 
Typical examples of the type of loose and inexact statements found in this 
book are as follows: (a) “Another common application of χ² is to examine the
frequencies of a contingency table in determining whether or not the two
variables are mutually independent” (Page 182), (b) “By chance, χ² varies
in value from zero to ∞” (Page 260). By and large, however, the author
succeeds in avoiding serious errors found in other texts. The author’s choice 
of language is sometimes misleading; for example, Chapter 4, dealing exclu- 
sively with curvilinear functions, is referred to as “. . . complex functions.” 
It should be realized that “complex functions” has a well recognized connotation
in mathematics which is not synonymous with “curvilinear functions.”
This book is the result of the author’s many years of experience in teaching 
experimental psychology on the graduate level. It contains an abundance of 
illustrative problems in the field of experimental psychology as well as a 
sufficient number of exercises which should prove of interest to students of 
psychology. 
It is wrapped in a paper cover and is reproduced in lithoprint. The dia- 
grams are clear and the printed text is well spaced and quite readable. There 
is no index but the table of contents is fairly detailed. 


Acceptance Sampling: A Symposium given at the 105th Annual Meeting. 
American Statistical Association, 1603 K Street, N.W., Washington, D. C., 
March 1950. Pp. 155. $1.50. 


REVIEW BY H. A. FREEMAN


Associate Professor of Statistics, Massachusetts Institute of Technology 
Cambridge, Mass. 


THIS volume comprises a series of papers and discussion on acceptance sampling
inspection delivered January 27, 1946 at the 105th Annual Meeting
of the American Statistical Association in Cleveland. There are two parts, 
acceptance sampling by attributes (78 pages) and by variables (71 pages). 
There is a foreword and a closure by the Chairman, John Tukey. 

First, attributes. This opens with a brief, well-written probe by Paul Peach 
into the earlier history of the field, touching lightly non-statistical develop- 
ments and more thoroughly the development of sampling concepts. The pio- 
neering contributions of Shewhart, Dodge and their Bell colleagues are noted. 
Attention is paid to the similarity of the development of sampling inspection 
theory and statistical theory, in particular, to the interesting correspondence 
of Dodge’s producer and consumer risks with Neyman and Pearson’s errors 
of the first and second kind. Edwin Olds follows with a workmanlike account 
of the 1941-1945 period. He describes, in fair detail, the double sampling 
plans of Dodge and Romig, Simon’s grand lot scheme, the Army Ordnance 
tables, the Army Quartermaster Corps sampling plans, the Navy manual on 
sampling inspection, Columbia University Statistical Research Group’s se- 
quential sampling plans, the work of the Office of Production Research and 
Development, and the continuous sampling plans of Dodge and Wald and 
Wolfowitz. The closing bibliography is excellent. The article is a nice blend, 
heavy in neither mathematical nor inspection detail. Olds makes no critical
comparison of these plans; he has a good word for each. 
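
The correspondence between Dodge's two risks and the two kinds of error can be made concrete with a small calculation. The sketch below is only an illustration and is not drawn from the symposium: the plan (sample size `n`, acceptance number `c`), the two quality levels, and the binomial model for the number of defectives in the sample are all assumptions chosen for the example.

```python
from math import comb


def accept_prob(n, c, p):
    """Probability of accepting under a single sampling plan by attributes.

    The lot is accepted when a sample of n items contains at most c defectives;
    the defective count is modeled as binomial with proportion defective p.
    """
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


# Hypothetical plan and quality levels, chosen only for illustration.
n, c = 50, 2
p_good, p_bad = 0.01, 0.10

# Null hypothesis: quality is acceptable (p = p_good).
# Producer's risk = chance of rejecting a good lot (error of the first kind).
producer_risk = 1 - accept_prob(n, c, p_good)
# Consumer's risk = chance of accepting a bad lot (error of the second kind).
consumer_risk = accept_prob(n, c, p_bad)

print(f"producer's risk = {producer_risk:.4f}")
print(f"consumer's risk = {consumer_risk:.4f}")
```

Evaluating `accept_prob` over a range of values of `p` traces the operating characteristic of the plan, which under this reading is one minus the power function of the corresponding test.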

The prepared discussion is by Walter Bartky (whose important contributions
to multiple sampling are explicitly recognized by several speakers),
by Harold Bellinson whose remarks centered on details of the Army Ord- 
nance plans, and by David Schwartz who considered problems of the future, 
among them, the use of variables instead of attributes in sampling inspection, 
and tests of increasing severity. But it is the informal discussion that is most 
engaging. It is gay, pleasantly chaotic, and discloses a fund of wisdom as well 
as a mild feud or two. John Curtiss of the Navy, Bellinson and Schwartz 
meet somewhat obliquely on a favorite topic, Army and Navy inspection, 
and the text is soon brightened by such familiar phrases as “the traditional 
policy of the Navy ....” There are a few spirited paragraphs on zero ac- 
ceptance numbers and an instructive if randomized discussion of critical ver- 
sus non-critical defects. In the free-for-all I would declare Curtiss the winner, 
a victory which suggests that the war-time superiority of Army over Navy 
inspection may be temporary. 

Except for a reference by Frederick Mosteller to the important problem of 
estimating the proportion defective, and an incidental remark or two by 
others, there was no discussion of the statistical theory of attribute sampling; 
all discussors concentrated on practical problems. 

The second half, sampling inspection by variables, is very interesting.
This area has seen fewer applications, and the result is a freshness which 
attribute inspection lacks. Moreover, the mathematics here is more interest- 
ing and unsolved problems are many. The two principal papers, by Curtiss 
and Allen Wallis, are excellent. While both speakers have had experience with 
applications in this relatively new field, they stay close to statistical theory. 
Curtiss develops his discussion from the basis of null and alternative hypoth- 
eses, using the apparatus of size and power. Dealing only with single sampling 
and largely with normal variables he discusses tests of hypotheses on the 
mean, with variance known and unknown, and tests of hypotheses on the 
variance. These tests are well-known but they are particularly well described 
here. Also discussed here and less well-known in inspection circles are Stein’s 
test, Bernstein’s (rather than Tchebycheff’s) inequality and the problem of 
using several parameters to measure quality. The subtle question of whether 
one is accepting a lot or a process, a question which is seldom asked let alone 
answered in the literature, is considered. Finally there are compact tables of 
various pairs of null and alternative hypotheses, best statistics and the cor- 
responding critical regions, all for single sampling. This is certainly an effec- 
tive paper and one that cannot be found elsewhere. 

Wallis’s shorter paper is equally fine; it is the one principal paper of the 
meeting that contains original results. The problem here is to test hypotheses 
on the proportion defective by use of a statistic of continuous variates. This 
leads for single sampling to non-central t, whose application to this problem 
was originally developed, I believe, by Jennett, Johnson, and Welch; Wallis 
gives a number of useful approximate formulae (some original with him) for 
this case, including that of the power function of the test. He then turns to 
an area in which he has done pioneering work—sequential tests, using statis- 
tics of continuous variates, of hypotheses on the proportion defective. This 
leads, at least in principle, to the best of best tests, for both the merits of 
the sequential method and those of continuous variates are present. The
prepared discussion matches the two papers, with all discussors, particularly 
Kenneth Arnold, Joseph Daly, and Alexander Mood, making solid contribu- 
tions. Thereafter, the open discussion makes a slow slide to an agreeable mix- 
ture of information and conversation. Finally, John Tukey’s excellent closure, 
classifying and summarizing the contributions of the discussors, with 
pertinent remarks of his own. 

The passage of time has not dealt kindly with this book, for interest in 
sampling inspection is not quite what it was in 1946. Moreover, some of the 
material has already been published; Wallis’s 6-page paper is contained in a 
91-page paper published elsewhere. But it is an interesting book now or any- 
time. Along with the Statistical Research Group’s Sampling Inspection (by 
attributes) and the forthcoming book by Albert Bowker and his Stanford 
associates on sampling inspection by variables, it will provide enough in- 
formation on the theory and mechanics of sampling inspection to satisfy 
almost all of us. 

A vote of credit should go to Chairman John Tukey and to his two associ- 
ates, Frederick Mosteller and Charles Winsor. The assembly of a book of 
this sort must have been an arduous task; they have done their work well. 
They deserve thanks, too, for editorial restraint. Such passages as 

“Curtiss: I mean the last thing he was doing, about subdividing the stuff. 
I thought that covered it. 
Mosteller: I thought Colonel Simon had some stuff on that in his book. 
Chairman: Well, is the discussion running down to an end or does some- 
body want to say something?” 
may contribute little except to printing costs, but they give the book, as 
they must have given the discussion, a gentle air of informality and irrespon- 
sibility, probably necessary conditions for important contributions. Neces- 
sary or not, the result is a good book, full of important fact and authoritative 
opinion; in fact, it’s full of stuff. 


BRIEFER NOTICE 
Demographic Yearbook, 1948. Prepared by the Statistical Office of the United Na- 
tions in collaboration with the Department of Social Affairs. Lake Success, N. Y., 
1949. Pp. 596. $7.00. 

IN March 1947, the Economic and Social Council recommended that the
United Nations publish “ . . . a demographic yearbook, containing regular
series of basic demographic statistics, comparable within and among them- 
selves, and relevant calculations of comparable rates....” The result of 
this recommendation now appears in the form of a comprehensive and sys- 
tematic collection of contemporary demographic data. 

This volume contains 39 basic international tables relating to the following 
subject matter: population, births and deaths, marriages and reproduction, 
life tables, and migration. An introductory section provides historical back- 
ground on previous compilations, and in addition, indicates some of the 
statistical pitfalls associated with demographic data, such as: (1) under-enumeration
and over-enumeration in population counts; (2) accuracy of
reporting and completeness of coverage of vital statistics; and (3) comparability
of international migration data. The concluding section of this work
is a bibliography of more than 1,000 titles of official publications on recent 
census data and current demographic statistics. 

Unlike many statistical compilations, the present study is attractive to 
the reader. The format is clear, the tables have been carefully annotated, 
and the volume is well bound. These features are especially welcome in a 
volume designed primarily for reference work. This volume should prove 
very useful to the specialist as well as to other readers who may be interested 
in demographic statistics. 
ERRATUM

In the December 1949 issue of the JOURNAL (p. 572) an unsuccessful
effort was made to print a misprint. The last paragraph of Professor Mostel- 
ler’s review of Palmer O. Johnson’s Statistical Methods in Research should 
have read “Perhaps the best thing to say is that Prentice-Hall’s advertise- 
ment for the book is correct in all particulars—except one, I doubt if the 
author did research with F. A. Fisher (sic), although the influence of R. A. 
is clear.” After the galley left our hands F. A. was changed to R. A. 